
HTTPError: 500 Server Error: Internal Server Error #1479

Closed
@petertevans

Description

I'm working on a simple NLP project and am trying to connect to a local CoreNLP server from a Jupyter notebook. When I pass my .txt file to the server for parsing, it returns this error:

HTTPError: 500 Server Error: Internal Server Error for url: http://localhost:9000/?properties=%7B%22outputFormat%22%3A+%22json%22%2C+%22annotators%22%3A+%22tokenize%2Cpos%2Clemma%2Cssplit%2Cparse%22%7D

When I open that URL directly and simply paste in the text of my file, it parses just fine.

Here's my simple script:

<><><><><><><><><><><><><><><><><><><><><>

from nltk.parse.corenlp import CoreNLPParser
from pprint import pprint
import io

parser = CoreNLPParser()

with open("INPUT_FILE.txt", "r", encoding="utf-8") as file:
    text = file.read()  # Read entire content

# Parse the text
parsed_sentences = list(parser.parse_text(text))

# Write parsed trees to a file
with open("OUTPUT_FILE.txt", "w", encoding="utf-8") as output_file:
    for tree in parsed_sentences:
        output_file.write(tree.pformat() + "\n\n")

<><><><><><><><><><><><><><><><><><><><><>

I'm using stanford-corenlp-4.5.8.jar and stanford-corenlp-4.5.8-models.jar.
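
For reference, the server is started from the directory containing those jars, roughly like the sketch below (a sketch only: the -mx4g memory, port 9000, and -timeout values are the usual ones from the setup instructions, and the jar directory path here is hypothetical):

<><><><><><><><><><><><><><><><><><><><><>

import subprocess

# Sketch: launch the CoreNLP server from the directory containing the
# stanford-corenlp-4.5.8 jars. The memory (-mx4g), port, and timeout values
# are assumptions based on the standard setup instructions.
subprocess.Popen(
    [
        "java", "-mx4g", "-cp", "*",
        "edu.stanford.nlp.pipeline.StanfordCoreNLPServer",
        "-port", "9000",
        "-timeout", "60000",
    ],
    cwd="/path/to/stanford-corenlp-4.5.8",  # hypothetical path to the jar directory
)

<><><><><><><><><><><><><><><><><><><><><>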

Here's the full error message I receive:

<><><><><><><><><><><><><><><><><><><><><>

HTTPError Traceback (most recent call last)
Cell In[8], line 13
10 text = file.read() # Read entire content
12 # Parse the text
---> 13 parsed_sentences = list(parser.parse_text(text))
15 ## Write parsed trees to a file
16 with open("HHB_c1_syntax_diagram.txt", "w", encoding="utf-8") as output_file:

File ~/Library/Python/3.11/lib/python/site-packages/nltk/parse/corenlp.py:303, in GenericCoreNLPParser.parse_text(self, text, *args, **kwargs)
294 def parse_text(self, text, *args, **kwargs):
295 """Parse a piece of text.
296
297 The text might contain several sentences which will be split by CoreNLP.
(...)
301
302 """
--> 303 parsed_data = self.api_call(text, *args, **kwargs)
305 for parse in parsed_data["sentences"]:
306 yield self.make_tree(parse)

File ~/Library/Python/3.11/lib/python/site-packages/nltk/parse/corenlp.py:255, in GenericCoreNLPParser.api_call(self, data, properties, timeout)
245 default_properties.update(properties or {})
247 response = self.session.post(
248 self.url,
249 params={"properties": json.dumps(default_properties)},
(...)
252 timeout=timeout,
253 )
--> 255 response.raise_for_status()
257 return response.json(strict=self.strict_json)

File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/requests/models.py:1024, in Response.raise_for_status(self)
1019 http_error_msg = (
1020 f"{self.status_code} Server Error: {reason} for url: {self.url}"
1021 )
1023 if http_error_msg:
-> 1024 raise HTTPError(http_error_msg, response=self)

HTTPError: 500 Server Error: Internal Server Error for url: http://localhost:9000/?properties=%7B%22outputFormat%22%3A+%22json%22%2C+%22annotators%22%3A+%22tokenize%2Cpos%2Clemma%2Cssplit%2Cparse%22%7D

<><><><><><><><><><><><><><><><><><><><><>

The text is one chapter of a book. I tried uploading the whole book to the online CoreNLP (version 4.5.8), but apparently it was over 200K tokens and the server only handles up to 100K, so I thought cutting down to a single chapter might work; it did not seem to make a difference. Any ideas what's amiss? I'd appreciate any help.
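
In case it helps narrow things down, one workaround I'm considering, if the 500 is just the chapter being too large for a single request, is splitting the text into smaller pieces before calling parse_text. A rough sketch (splitting on blank lines is an assumption about the chapter's formatting, and the 50,000-character chunk size is an arbitrary margin under what I understand is the server's default 100,000-character request limit):

<><><><><><><><><><><><><><><><><><><><><>

from nltk.parse.corenlp import CoreNLPParser

parser = CoreNLPParser()

def parse_in_chunks(text, max_chars=50000):
    """Yield parse trees, posting the text to the server in pieces that
    stay under its default request-size limit (sketch only)."""
    chunk, size = [], 0
    for paragraph in text.split("\n\n"):  # assumes blank-line paragraph breaks
        if size + len(paragraph) > max_chars and chunk:
            yield from parser.parse_text("\n\n".join(chunk))
            chunk, size = [], 0
        chunk.append(paragraph)
        size += len(paragraph) + 2
    if chunk:
        yield from parser.parse_text("\n\n".join(chunk))

with open("INPUT_FILE.txt", "r", encoding="utf-8") as file:
    trees = list(parse_in_chunks(file.read()))

<><><><><><><><><><><><><><><><><><><><><>

I also gather the server itself has a -maxCharLength option (default 100,000 characters) that could be raised when launching it, but chunking the input seems simpler if the size limit really is the problem.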
