Incompatible jars after upgrading from Spark 2 to Spark 3. Caused by: java.io.InvalidClassException: com.johnsnowlabs.nlp.util.regex.RuleFactory #14463

Open
shaheenshanavaz opened this issue Nov 25, 2024 · 0 comments

I am upgrading a model that currently runs on Spark 2 to Spark 3. For this upgrade we supply the jars manually.
The model fails because of incompatible jars.

Failure:
"Caused by: java.io.InvalidClassException: com.johnsnowlabs.nlp.util.regex.RuleFactory; local class incompatible: stream classdesc serialVersionUID = someserialnumber, local class serialVersionUID = someserialnumber"
I get this error when using the jars listed below.
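
For context, this exception is raised by Java deserialization when the serialVersionUID recorded in the serialized stream differs from the one in the class loaded locally, so the RuleFactory bytes were written by a different build of the class than the one on the classpath. One possible source, and this is only an assumption, is that more than one of the supplied jars bundles com.johnsnowlabs.nlp.util.regex.RuleFactory. A minimal sketch for checking that is below; `jars_containing` is a hypothetical helper name, and the directory argument stands in for whatever ${JAR_DIR} points to.

```python
import sys
import zipfile
from pathlib import Path

# Class named in the stack trace, written as its entry path inside a jar.
CLASS_ENTRY = "com/johnsnowlabs/nlp/util/regex/RuleFactory.class"


def jars_containing(jar_dir, class_entry=CLASS_ENTRY):
    """Return the sorted names of jars in jar_dir that bundle class_entry.

    Hypothetical helper for illustration; it is not part of Spark or Spark NLP.
    """
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        # A jar is a zip archive, so the standard zipfile module can list it.
        with zipfile.ZipFile(jar) as zf:
            if class_entry in zf.namelist():
                hits.append(jar.name)
    return hits


if __name__ == "__main__":
    # Usage: python find_class.py /path/to/jar/dir
    jar_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    for name in jars_containing(jar_dir):
        print(name)
```

If more than one jar is printed, the driver and executors may be loading different builds of the same class. If only one is printed, the mismatch may instead come from a previously saved pipeline or model that was serialized with the older library version.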


Versions used for the upgrade (Spark 3):
nlp --> 2.12-5.5.1
nlp-assembler --> 2.5.5
Tensorflow --> 1.15.5-1.5.5
Graphframes --> 0.6.0-spark2.3-s_2.11
parso --> 2.0.14.jar
spark-sas7bdat --> 2.1.0-s_2.11.jar
Python --> 3.9
Java --> 8
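
Several of these artifact names carry a Scala binary-version suffix (`_2.11` or `_2.12`). Prebuilt Spark 3 distributions use Scala 2.12 (or 2.13 in later releases), so jars built for Scala 2.11 are worth a second look. The sketch below only inspects file names and assumes the suffix convention holds for every jar; `scala_versions` is a hypothetical helper, and the jar names are copied from the `--jars` list in this issue.

```python
import re

# Matches a Scala binary-version suffix such as "_2.11" or "_2.12" in a jar name.
SCALA_SUFFIX = re.compile(r"_(2\.\d+)")


def scala_versions(jar_names):
    """Map each Scala binary version found in the names to the jars that carry it.

    Hypothetical helper for illustration; jars without a suffix are skipped.
    """
    found = {}
    for name in jar_names:
        match = SCALA_SUFFIX.search(name)
        if match:
            found.setdefault(match.group(1), []).append(name)
    return found


# Jar names as they appear in the --jars list of this issue.
jars = [
    "spark-nlp_2.12-5.5.1.jar",
    "spark-sas7bdat-2.1.0-s_2.11.jar",
    "parso-2.0.14.jar",
    "spark-nlp-assembly-2.5.5.jar",
    "tensorflow-1.15.5.jar",
    "graphframes-0.6.0-spark2.3-s_2.11.jar",
]

versions = scala_versions(jars)
if len(versions) > 1:
    print("Mixed Scala versions:", versions)
```

A name-based check cannot say anything about jars that have no suffix, such as spark-nlp-assembly-2.5.5.jar, so those would need to be checked against their own release notes.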

Existing jar files (Spark 2):
spark-nlp --> 2.11-2.4.3.jar
spark-assembler --> 2.1.0-s_2.11.jar
parso-2.0.10.jar
spark-nlp-assembler-2.5.5.jar
tensorflow-1.13.1.jar
graphframes-0.6.0-spark2.3-s_2.11.jar

Please help me resolve this issue.
Let me know if more information is required.

Here is the code where the jars are specified:

# spark3-submit command line; --jars must be one comma-separated value with no whitespace,
# so the continuation lines of the jar list start in the first column.
export spark_Exec="spark3-submit \
--master yarn --queue ${queue_serv} \
--conf spark.executorEnv.pyspark_python=$pyspark_python \
--conf spark.executorEnv.PYTHONPATH_PYTHON=$PYTHONPATH_PYTHON \
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=$PYSPARK_PYTHON \
--driver-memory=8G \
--executor-memory=8G \
--conf spark.driver.maxResultSize=10G \
--conf spark.mesos.executor.memoryOverhead=600 \
--jars ${JAR_DIR}/spark-nlp_2.12-5.5.1.jar,${JAR_DIR}/spark-sas7bdat-2.1.0-s_2.11.jar,\
${JAR_DIR}/parso-2.0.14.jar,${JAR_DIR}/spark-nlp-assembly-2.5.5.jar,${JAR_DIR}/tensorflow-1.15.5.jar,\
${JAR_DIR}/graphframes-0.6.0-spark2.3-s_2.11.jar"
