Scispacy library not working with medspacy. #513

Open
DeFrayne opened this issue May 2, 2024 · 2 comments

@DeFrayne

DeFrayne commented May 2, 2024

Hello, I am trying to use the en_ner_bionlp13cg_md model with medspacy. It only seems to work if I disable the parser, even though parsing is a major appeal of medspacy. The call that works is shown below:
nlp = medspacy.load("en_ner_bionlp13cg_md", disable=['parser'])

This is successful, but I lose parsing.

If I run the following:
import medspacy
from medspacy.visualization import visualize_ent

nlp = medspacy.load("en_ner_bionlp13cg_md")
text = "blahblahblah"
doc = nlp(text)
visualize_ent(doc)

I get the following error:
ValueError Traceback (most recent call last)
Input In [86], in <cell line: 2>()
1 text = "blahblahblah"
----> 2 doc = nlp(text)
3 visualize_ent(doc)

File c:\Users\ddefr\anaconda3\lib\site-packages\spacy\language.py:1054, in Language.__call__(self, text, disable, component_cfg)
1052 raise ValueError(Errors.E109.format(name=name)) from e
1053 except Exception as e:
-> 1054 error_handler(name, proc, [doc], e)
1055 if not isinstance(doc, Doc):
1056 raise ValueError(Errors.E005.format(name=name, returned_type=type(doc)))

File c:\Users\ddefr\anaconda3\lib\site-packages\spacy\util.py:1722, in raise_error(proc_name, proc, docs, e)
1721 def raise_error(proc_name, proc, docs, e):
-> 1722 raise e

File c:\Users\ddefr\anaconda3\lib\site-packages\spacy\language.py:1049, in Language.__call__(self, text, disable, component_cfg)
1047 error_handler = proc.get_error_handler()
1048 try:
-> 1049 doc = proc(doc, **component_cfg.get(name, {})) # type: ignore[call-arg]
1050 except KeyError as e:
1051 # This typically happens if a component is not initialized
1052 raise ValueError(Errors.E109.format(name=name)) from e

File c:\Users\ddefr\anaconda3\lib\site-packages\PyRuSH\PyRuSHSentencizer.py:53, in PyRuSHSentencizer.__call__(self, doc)
51 def __call__(self, doc):
52 tags = self.predict([doc])
---> 53 cset_annotations([doc], tags)
54 return doc

File c:\Users\ddefr\anaconda3\lib\site-packages\PyRuSH\StaticSentencizerFun.pyx:48, in PyRuSH.StaticSentencizerFun.cset_annotations()

File c:\Users\ddefr\anaconda3\lib\site-packages\PyRuSH\StaticSentencizerFun.pyx:56, in PyRuSH.StaticSentencizerFun.cset_annotations()

File c:\Users\ddefr\anaconda3\lib\site-packages\spacy\tokens\token.pyx:509, in spacy.tokens.token.Token.sent_start.set()

File c:\Users\ddefr\anaconda3\lib\site-packages\spacy\tokens\token.pyx:528, in spacy.tokens.token.Token.is_sent_start.set()

ValueError: [E043] Refusing to write to token.sent_start if its document is parsed, because this may cause inconsistent state.

Any assistance in resolving this is greatly appreciated. I do not have this error if I use spacy.load(), only medspacy.load().

@dakinggg
Collaborator

dakinggg commented Jun 7, 2024

I'm a bit confused about why you are trying to use medspacy with a scispacy model. Is that expected to work?

@DeFrayne
Author

DeFrayne commented Jun 8, 2024

Yes - I just ended up disabling the parser and going with it. It works fine.
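
For anyone else who hits E043: the traceback above shows PyRuSHSentencizer writing token.sent_start on a document that spaCy already considers parsed, which suggests the sentence splitter ran after the parser. A minimal sketch for checking that ordering is below. The helper name is hypothetical, and the default component name "medspacy_pyrush" is an assumption about how the PyRuSH sentencizer is registered in your medspacy version, so compare it against your own pipeline listing first.

```python
from typing import Iterable


def sentence_splitter_runs_after_parser(
    pipe_names: Iterable[str],
    splitter: str = "medspacy_pyrush",  # assumed component name, verify locally
    parser: str = "parser",
) -> bool:
    """Return True if both components are present and the splitter comes after the parser."""
    names = list(pipe_names)
    if splitter not in names or parser not in names:
        return False
    return names.index(splitter) > names.index(parser)


# Hand-written pipeline orders for illustration only, not taken from a real model.
print(sentence_splitter_runs_after_parser(
    ["tok2vec", "tagger", "parser", "ner", "medspacy_pyrush"]))  # True
print(sentence_splitter_runs_after_parser(
    ["medspacy_pyrush", "tok2vec", "tagger", "parser", "ner"]))  # False
```

With a loaded pipeline you would pass nlp.pipe_names to the helper. If it returns True, that matches the error in this issue; disabling the parser, as done above, avoids the conflict at the cost of losing dependency parses.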
