HTTPError: 500 Server Error: Internal Server Error #1479

Open
petertevans opened this issue Feb 26, 2025 · 3 comments

@petertevans

I'm working on a simple NLP project and am trying to connect to the CoreNLP server via Jupyter Notebook. When I pass my .txt file to the server for parsing, it returns this error:

HTTPError: 500 Server Error: Internal Server Error for url: http://localhost:9000/?properties=%7B%22outputFormat%22%3A+%22json%22%2C+%22annotators%22%3A+%22tokenize%2Cpos%2Clemma%2Cssplit%2Cparse%22%7D

When I follow the url and simply paste the text of my file, it parses just fine.

Here's my simple script:

<><><><><><><><><><><><><><><><><><><>><><

from nltk.parse.corenlp import CoreNLPParser
from pprint import pprint
import io

parser = CoreNLPParser()

with open("INPUT_FILE.txt", "r", encoding="utf-8") as file:
    text = file.read()  # Read entire content

# Parse the text
parsed_sentences = list(parser.parse_text(text))

# Write each parsed tree to the output file
with open("OUTPUT_FILE.txt", "w", encoding="utf-8") as output_file:
    for tree in parsed_sentences:
        output_file.write(tree.pformat() + "\n\n")

<><><><><><><><><><><><><><><><><><><>><><

I'm using the stanford-corenlp-4.5.8.jar, and the stanford-corenlp-4.5.8-models.jar.

Here's the full error message I receive:

<><><><><><><><><><><><><><><><><><><>><><

HTTPError Traceback (most recent call last)
Cell In[8], line 13
10 text = file.read() # Read entire content
12 # Parse the text
---> 13 parsed_sentences = list(parser.parse_text(text))
15 ## Write parsed trees to a file
16 with open("HHB_c1_syntax_diagram.txt", "w", encoding="utf-8") as output_file:

File ~/Library/Python/3.11/lib/python/site-packages/nltk/parse/corenlp.py:303, in GenericCoreNLPParser.parse_text(self, text, *args, **kwargs)
294 def parse_text(self, text, *args, **kwargs):
295 """Parse a piece of text.
296
297 The text might contain several sentences which will be split by CoreNLP.
(...)
301
302 """
--> 303 parsed_data = self.api_call(text, *args, **kwargs)
305 for parse in parsed_data["sentences"]:
306 yield self.make_tree(parse)

File ~/Library/Python/3.11/lib/python/site-packages/nltk/parse/corenlp.py:255, in GenericCoreNLPParser.api_call(self, data, properties, timeout)
245 default_properties.update(properties or {})
247 response = self.session.post(
248 self.url,
249 params={"properties": json.dumps(default_properties)},
(...)
252 timeout=timeout,
253 )
--> 255 response.raise_for_status()
257 return response.json(strict=self.strict_json)

File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/requests/models.py:1024, in Response.raise_for_status(self)
1019 http_error_msg = (
1020 f"{self.status_code} Server Error: {reason} for url: {self.url}"
1021 )
1023 if http_error_msg:
-> 1024 raise HTTPError(http_error_msg, response=self)

HTTPError: 500 Server Error: Internal Server Error for url: http://localhost:9000/?properties=%7B%22outputFormat%22%3A+%22json%22%2C+%22annotators%22%3A+%22tokenize%2Cpos%2Clemma%2Cssplit%2Cparse%22%7D

<><><><><><><><><><><><><><><><><><><>><><

The text is one chapter of a book; I tried uploading the whole book text to the online CoreNLP version 4.5.8, but apparently it was over 200K tokens and the server only handles up to 100K. I thought maybe cutting down to a single chapter would work, but it did not seem to make a difference. Any ideas what's amiss? Appreciate any help.
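If request size or a server-side timeout is the cause (an assumption; nothing in this thread confirms it), one way to test the idea is to send the chapter in smaller pieces. The sketch below groups paragraphs into chunks and parses each chunk separately. The helper names `split_into_chunks` and `parse_in_chunks` and the `max_chars=5000` limit are hypothetical, chosen only for illustration; `parser` is expected to be an object with a `parse_text` method, such as the `CoreNLPParser` above.

```python
def split_into_chunks(text, max_chars=5000):
    """Group blank-line-separated paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole rather than
    being cut in the middle of a sentence.
    """
    chunks, current, size = [], [], 0
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # Account for the "\n\n" separator when the chunk already has content.
        added = len(paragraph) + (2 if current else 0)
        if current and size + added > max_chars:
            chunks.append("\n\n".join(current))
            current, size = [], 0
            added = len(paragraph)
        current.append(paragraph)
        size += added
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def parse_in_chunks(parser, text, max_chars=5000):
    """Yield parse trees chunk by chunk instead of in one large request."""
    for chunk in split_into_chunks(text, max_chars):
        yield from parser.parse_text(chunk)
```

With the script above, `parsed_sentences = list(parse_in_chunks(parser, text))` would replace the single `parse_text` call. If small chunks succeed where the full chapter fails, that points toward a size or timeout limit rather than a problem with the text itself.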

@AngledLuffa
Contributor

I don't know anything about the internal workings of the NLTK CoreNLPParser() interface. Does it print out any further information, such as what happens when it tries to connect to CoreNLP or what happens when it runs the query? Is it just the parser, or is it using other annotators as well?

Personally I would suggest contacting the NLTK folks with this issue, since they're more likely to know how their interface works.
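On the question of further information: the traceback shows only the status line, but the exception raised by `requests` carries the full response, and the body of a 500 reply may include the server's own error message. A minimal sketch for printing it is below. The helper names `describe_http_failure` and `parse_with_diagnostics` are hypothetical, and the code assumes only that the exception has a `response` attribute with `status_code` and `text`, as `requests.HTTPError` does.

```python
def describe_http_failure(exc, max_body_chars=2000):
    """Build a short report from an exception that may carry an HTTP response."""
    response = getattr(exc, "response", None)
    if response is None:
        return f"{type(exc).__name__}: {exc} (no HTTP response attached)"
    # Truncate the body so a long stack trace does not flood the notebook.
    body = response.text.strip()[:max_body_chars]
    return f"HTTP {response.status_code}\n{body}"


def parse_with_diagnostics(parser, text):
    """Parse text, printing the server's response body if the call fails."""
    try:
        return list(parser.parse_text(text))
    except Exception as exc:
        print(describe_http_failure(exc))
        raise
```

The terminal or log where the CoreNLP server is running is also worth checking, since the Java side may print a stack trace there when a request fails.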

@petertevans
Author

petertevans commented Feb 26, 2025

Thanks for your reply. Below is how I initially connected to CoreNLP (I followed the steps/instructions included here: https://bbengfort.github.io/2018/06/corenlp-nltk-parses/). As far as I know, I was simply trying to run the parser; I wanted to analyze the syntax diagrams and trees.

<><><><><><><><><><><><><><><><><><>

import os
from nltk.parse.corenlp import CoreNLPServer

STANFORD = '/Path/to/my/file'

server = CoreNLPServer(
    os.path.join(STANFORD, "stanford-corenlp-4.5.8.jar"),
    os.path.join(STANFORD, "stanford-corenlp-4.5.8-models.jar"),
)

try:
    print("Starting CoreNLP server...")
    server.start()
    print("Server started successfully!")
except Exception as e:
    print("Error starting server:", e)

<><><><><><><><><><><><><><><><><><>

My previous comment included all the output from the error message. There was nothing additional to include. I'll try to contact the NLTK folks and see what they say. Thanks for your help.

@AngledLuffa
Contributor

There is also a similar client in Stanza for connecting to CoreNLP. We would be better able to support that tool.
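For reference, a rough sketch of what the Stanza route could look like is below. It assumes Stanza is installed and that the `CORENLP_HOME` environment variable points at the unpacked CoreNLP directory. The function name `parse_with_stanza` and the timeout and memory values are hypothetical placeholders, not recommendations from this thread.

```python
def parse_with_stanza(text, timeout_ms=60000, memory="4G"):
    """Return one bracketed parse string per sentence via Stanza's CoreNLPClient."""
    # Imported inside the function so this file still loads without Stanza.
    from stanza.server import CoreNLPClient

    # The client starts a CoreNLP server as a context manager and stops it on exit.
    with CoreNLPClient(
        annotators=["tokenize", "ssplit", "pos", "lemma", "parse"],
        timeout=timeout_ms,  # milliseconds; illustrative value
        memory=memory,       # JVM heap size; illustrative value
        output_format="json",
    ) as client:
        annotation = client.annotate(text)

    # With JSON output, each sentence entry is expected to carry a "parse" string.
    return [sentence["parse"] for sentence in annotation["sentences"]]
```

The annotator list mirrors the one in the failing request URL, so the results should be comparable with the NLTK attempt.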
