[Multi-Modal Image+Text] Explore available software packages to pre-process the text reports. #36
Comments
Here is a link to the Stanford software: it is based on this link:
Relevant MICCAI 2018 paper: TextRay: Mining Clinical Reports to Gain a
In this paper, they used a tool from NIH called Medical Text Indexer. Here is what they did: This might be helpful for tagging. Please take a look.
@pyadolla, would you please add the CliNER results here for the record?
@Sumedha
@sumedhasingla, if you have some preliminary results from TIES, please paste an example here.
NOBLE Tool extensively tags the reports with concepts and their semantic types, using a chosen thesaurus. I am using "NCI_Metathesaurus". The concepts found by NOBLE are then used as input to pyContext to find negations.
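To make the concept-then-negation step concrete, here is a rough NegEx-style sketch in Python. It is not pyContext's API and not NOBLE's output format: the trigger list, the five-token window, and the function names are all assumptions for illustration, and it expects one sentence at a time.

```python
import re

# Hypothetical, deliberately tiny trigger list; real NegEx/pyContext
# lexicons are much larger and also handle post-concept triggers.
NEGATION_TRIGGERS = ("no", "without", "denies", "negative for")
SCOPE_TOKENS = 5  # assumed window: look at up to 5 tokens before the concept


def is_negated(sentence: str, concept: str) -> bool:
    """Return True if a trigger occurs within SCOPE_TOKENS tokens before `concept`."""
    text = sentence.lower()
    start = text.find(concept.lower())
    if start == -1:
        return False  # concept not mentioned in this sentence
    # Keep only the tokens closest to the concept.
    preceding = re.findall(r"[a-z]+", text[:start])[-SCOPE_TOKENS:]
    window = " ".join(preceding)
    return any(
        re.search(rf"\b{re.escape(trigger)}\b", window)
        for trigger in NEGATION_TRIGGERS
    )


def tag_negations(sentence: str, concepts: list) -> dict:
    """Map each concept (e.g. from a concept tagger) to its negation flag."""
    return {concept: is_negated(sentence, concept) for concept in concepts}
```

For example, `tag_negations("No evidence of pneumothorax.", ["pneumothorax"])` flags the concept as negated, while `tag_negations("Mild cardiomegaly is present.", ["cardiomegaly"])` does not.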
The TIES annotation tool cannot process the reports we have in RAD-ALL.deid: these reports were extracted directly from MARS, and there is no way to query or find them through the TIES interface. To get tags from TIES, we would have to extract the reports again from TIES (500k) and save the annotation information with each report. We ran this process on a small sample of about 5k reports. The results are at: /pghbio/dbmi/batmanlab/Data/radiologyTextDataset2/Reports/test-concepts The problem with this approach is that TIES can handle these annotations for only 5k files at a time, so the process has to be re-run after every 5k reports. Also, each TIES query used to extract reports must be built so that it returns at most 5k reports. Since TIES uses NOBLE Tool under the hood, maybe we can skip the TIES annotation.
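If we do go the TIES route, the 5k-per-run limit could be handled by splitting the report list up front. A minimal sketch, assuming the report IDs are available as a plain Python list; the `batches` helper and the IDs are hypothetical, and how each batch is actually queried in TIES is not shown.

```python
from typing import Iterator, List, Sequence

MAX_BATCH = 5000  # per-run limit reported for TIES annotation


def batches(report_ids: Sequence[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` report IDs."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(report_ids), size):
        yield list(report_ids[start:start + size])


if __name__ == "__main__":
    # Hypothetical IDs; in practice these would come from the report index.
    ids = [f"report_{i}" for i in range(12001)]
    for n, chunk in enumerate(batches(ids), start=1):
        print(f"batch {n}: {len(chunk)} reports")
```

Each chunk would then correspond to one TIES query and annotation run.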
Location: /pghbio/dbmi/batmanlab/Data/radiologyTextDataset2/singla/RAD-ALL.deid
Keyword or concept tagging
Noble Coder: Named Entity Recognition (NER) engine for biomedical text. DBMI tool. Can be used with TIES.
TIES built-in annotation tool
Work with Mike to get the annotations that TIES stores for each report.
Apache cTAKES
Negation identifier
- https://github.com/chapmanbe/negex
Pre-processing for VQA
Pre-processing for Image Captioning