@YashPayU A `NoSuchMethodError` like this indicates a version conflict between the AWS SDK jars on your classpath. `hadoop-aws-3.3.1` is built against the shaded `aws-java-sdk-bundle` jar, but the unbundled `aws-java-sdk-core` / `aws-java-sdk-glue` jars in `/opt/spark/jars` shadow its classes with incompatible method signatures — note the shaded package `com.amazonaws.thirdparty.apache.logging.Log` in the stack trace, which only exists inside the bundle jar. Instead of shipping the individual SDK jars, keep only the single `aws-java-sdk-bundle` jar that matches your `hadoop-aws` version and remove the rest.
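As a quick sanity check along those lines, a small script can scan the jars directory and flag when both the shaded bundle and unbundled SDK jars are present at the same time. This is only a sketch; the `/opt/spark/jars` path and the fallback jar list are taken from this thread and may differ in your setup.

```python
# Sketch: detect mixed AWS SDK jars on the Spark classpath, a common cause
# of NoSuchMethodError with hadoop-aws. Paths here are illustrative.
import os
import re

def find_sdk_conflicts(jar_names):
    """Split jar file names into (bundle_jars, unbundled_sdk_jars).

    Having both kinds on the classpath at once is what typically triggers
    the error above: the unbundled jars shadow classes that hadoop-aws
    expects to load from the shaded aws-java-sdk-bundle jar.
    """
    bundle = [j for j in jar_names if j.startswith("aws-java-sdk-bundle-")]
    # Match aws-java-sdk-* jars that are NOT the bundle itself.
    unbundled = [j for j in jar_names if re.match(r"aws-java-sdk-(?!bundle)", j)]
    return bundle, unbundled

if __name__ == "__main__":
    jars_dir = "/opt/spark/jars"  # assumption: the image layout from the question
    if os.path.isdir(jars_dir):
        jars = os.listdir(jars_dir)
    else:
        # Fall back to the jar list quoted in the question, for illustration.
        jars = [
            "aws-java-sdk-bundle-1.11.901.jar",
            "aws-java-sdk-core-1.11.797.jar",
            "aws-java-sdk-glue-1.11.797.jar",
            "hadoop-aws-3.3.1.jar",
        ]
    bundle, unbundled = find_sdk_conflicts(jars)
    if bundle and unbundled:
        print("Conflict: keep", bundle, "and remove", unbundled)
```

On the jar list from this thread the script reports `aws-java-sdk-core-1.11.797.jar` and `aws-java-sdk-glue-1.11.797.jar` as the jars to remove.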
Unable to save a Spark dataframe from local disk to an S3 bucket.
Env setup
Base image: `gcr.io/datamechanics/spark:platform-3.2.1-hadoop-3.3.1-java-11-scala-2.12-python-3.8-dm18`
I have the following jars in `/opt/spark/jars`: `aws-java-sdk-bundle-1.11.901.jar`, `aws-java-sdk-core-1.11.797.jar`, `aws-java-sdk-glue-1.11.797.jar`, `hadoop-aws-3.3.1.jar`
Sample code

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.hadoop.fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
    .config("spark.dynamicAllocation.enabled", "true")
    .config("spark.dynamicAllocation.maxExecutors", "4")
    .config("spark.dynamicAllocation.minExecutors", "1")
    .config("spark.dynamicAllocation.initialExecutors", "1")
    .config("spark.sql.parquet.datetimeRebaseModeInRead", "CORRECTED")
    .config("spark.sql.legacy.pathOptionBehavior.enabled", "true")
    .config("spark.sql.parquet.datetimeRebaseModeInWrite", "CORRECTED")
    .getOrCreate()
)

source_file = "/workspaces/sample/test/*"
df = spark.read.parquet(source_file)
df.write.format("parquet").mode("append").save("s3a://MY_BUCKET/MY_FOLDER/")
```
ERROR:

```
java.io.IOException: regular upload failed: java.lang.NoSuchMethodError: 'void com.amazonaws.util.IOUtils.release(java.io.Closeable, com.amazonaws.thirdparty.apache.logging.Log)'
```