Is there support of the 2.X.X versions of Apache Spark?
Further Information
I see a pyspark 3.2.0 dependency in pyproject.toml. But in real enterprise and on-premise clusters the version is typically 2.X.X. Is any Spark version other than 3.2.0 supported?
System Information
OS: RHEL
OS Version: 8
Language Version: 3.7
Package Manager Version: PIP
Additional Context
It would be good to see a list of supported Spark/Beam versions, but I couldn't find one. If there is one, could you please send me a link? Thank you!
We haven't tested on 2.X yet, but I think it should be easy to add support for it (it might even work with 2.X out of the box). That's because PipelineDP only needs some basic RDD APIs, such as map, reduceByKey and join; other Spark APIs such as DataFrames are not supported yet. You can see every Spark API that is used in the SparkRDDBackend class. If you have any feedback on using Spark, please let me know. Also, if you test it with Spark 2.*, please let me know the results.
In the next release, we will remove the restriction to pyspark 3.2.0.
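For anyone who wants to try this on a 2.X cluster, here is a minimal sketch of a smoke test for the RDD operations named above (map, reduceByKey, join). It does not call PipelineDP itself; the function name `rdd_smoke_test`, the sample data, and the local master setting are all hypothetical and only illustrate the idea. Passing it would show that these basic calls behave the same on your Spark version, not that PipelineDP is fully supported there.

```python
def rdd_smoke_test(sc):
    """Run map, reduceByKey and join on tiny RDDs and return the sorted result.

    `sc` is expected to be a pyspark SparkContext (hypothetical helper,
    not part of PipelineDP).
    """
    visits = sc.parallelize([("alice", 1), ("bob", 2), ("alice", 3)])
    regions = sc.parallelize([("alice", "EU"), ("bob", "US")])

    # map: double every value, keeping the key.
    doubled = visits.map(lambda kv: (kv[0], kv[1] * 2))
    # reduceByKey: sum the values of each key.
    totals = doubled.reduceByKey(lambda a, b: a + b)
    # join: pair each total with the region that has the same key.
    joined = totals.join(regions)
    return sorted(joined.collect())


def main():
    # pyspark is imported here so the helper above can be imported without it.
    from pyspark import SparkContext

    sc = SparkContext("local[1]", "rdd-smoke-test")
    try:
        print("Spark version:", sc.version)
        print(rdd_smoke_test(sc))
    finally:
        sc.stop()
```

Calling `main()` in an environment with the cluster's pyspark version installed prints the Spark version and the joined result, which should be `[('alice', (8, 'EU')), ('bob', (4, 'US'))]`.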