Py4JError: An error occurred while calling o9368.fit #14375
Comments
Could you please provide the actual code you used to start the SparkSession and build the pipeline, so we can reproduce it?
The zip file I attached contains an .ipynb file with the code.
Please include the code here or on Google Colab. We are not allowed to download and open zip files for security reasons. You just need to follow the template, nothing more and nothing less. The issue template is designed based on years of experience.
import sparknlp
spark = sparknlp.start()
documentAssembler = DocumentAssembler()
sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")
tokenizer = Tokenizer()
POSTag = PerceptronModel.pretrained()
chunker = Chunker()
embeddings = WordEmbeddingsModel.pretrained("glove_100d")
ner_model = NerDLApproach()
ner_converter = NerConverter()
c_pipeline = Pipeline(stages=[
import pandas as pd
#df = spark.read.csv("pii_dataset.csv", header=True, inferSchema=True)
df = pd.read_csv("pii_dataset.csv")
df1 = spark.createDataFrame(df)
f_model = c_pipeline.fit(df1)
#result.select( explode(col("chunk.result")).alias("chunk_tag")).show(truncate=False)
df_new = df1.join(result.select("text", "pos.result"), on="text", how="left")
#df_new1 = df_new.join(result.select("text", "chunk.result"), on="text", how="left")
import ast
df_new2 = df_new.toPandas()
selected_df = spark.createDataFrame(df_new2)
def convert_to_conll(sentences):
conll_data = convert_to_conll(rows_as_dicts)
with open('annotations.conll', 'w') as file:
print("Dataset converted to CoNLL format and saved as 'annotations.conll'.")
nerpipeline = Pipeline(stages=[
from sparknlp.training import CoNLL
conll_instance = CoNLL()
training_data = conll_instance.readDataset(spark=spark, path='annotations.conll')
model = nerpipeline.fit(training_data)
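The body of `convert_to_conll` is not shown in the snippet above, so the exact file layout passed to `CoNLL().readDataset` cannot be checked from this thread. As a rough illustration only, here is a minimal sketch of writing CoNLL-2003 style text, where each line is `token POS chunk label` and sentences are separated by a blank line. The function name `to_conll_2003`, the input shape (lists of `(token, pos, label)` triples), and the `"O"` placeholder for the chunk column are assumptions made for this example, not the reporter's code.

```python
from typing import Iterable, List, Tuple

# Hypothetical input shape: one list of (token, pos_tag, ner_label) triples per sentence.
Sentence = List[Tuple[str, str, str]]


def to_conll_2003(sentences: Iterable[Sentence], chunk_placeholder: str = "O") -> str:
    """Render sentences as CoNLL-2003 style text.

    Each token becomes one space-separated line: token, POS tag, chunk tag, NER label.
    The chunk column is filled with a placeholder (an assumption of this sketch),
    since the pipeline in the thread does not show a chunk tag per token.
    """
    # Conventional document header, followed by a blank line.
    lines = ["-DOCSTART- -X- -X- O", ""]
    for sentence in sentences:
        for token, pos, label in sentence:
            lines.append(f"{token} {pos} {chunk_placeholder} {label}")
        # A blank line marks the end of each sentence.
        lines.append("")
    return "\n".join(lines)


# Illustrative data only; not taken from pii_dataset.csv.
example = [
    [("John", "NNP", "B-PER"), ("lives", "VBZ", "O"), ("in", "IN", "O"), ("Paris", "NNP", "B-LOC")],
]
conll_text = to_conll_2003(example)
```

Writing `conll_text` to `annotations.conll` would then give a file in the layout described above. Tokens that contain spaces would break the space-separated columns, so they would need to be cleaned or split before conversion.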
Is there an existing issue for this?
Who can help?
No response
What are you working on?
I am training NerDLApproach for custom entities. When I increase the size of the training data, I get the error "Py4JError: An error occurred while calling o9368.fit" and the connection is refused.
Current Behavior
I get the error "Py4JError: An error occurred while calling o9368.fit" and the connection is refused.
Expected Behavior
Model training should complete, and the trained model should then be usable for NER on new text.
Steps To Reproduce
CoNll.zip
Spark NLP version and Apache Spark
I launched John Snow Labs on an EC2 instance of type m5.2xlarge.
Type of Spark Application
Python Application
Java Version
No response
Java Home Directory
No response
Setup and installation
Spark NLP, installed through John Snow Labs
Operating System and Version
No response
Link to your project (if available)
No response
Additional Information
Please let me know if any further information is needed.