extend to biorxiv #5
Comments
The hardest part would be building the citation network. We did this for the arXiv by parsing the source files (mostly TeX) for their references and then achieving a high match rate between this reference info and the arXiv papers being referred to. In the many cases where no arXiv or DOI identifiers are available, the next best thing is the journal information, which requires specialized regexes to parse well. High energy physics has the best representation in Paperscape partly because we were both working in the field at the time and were most familiar with its journals (but also because arXiv usage and referencing are generally better in this field). Generating a Paperscape-like map from another citation network is relatively straightforward,
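As a rough illustration of the identifier-matching step described above, here is a minimal Python sketch that pulls arXiv identifiers and DOIs out of a raw reference string. This is not Paperscape's actual parser: the function name `extract_identifiers` and all three regular expressions are assumptions made for this example, and they cover only the common identifier forms. References with neither identifier would still need the journal-specific parsing mentioned above.

```python
import re

# Illustrative patterns only (assumptions, not Paperscape's real regexes).
# New-style arXiv identifiers, e.g. "arXiv:1207.7214v2".
ARXIV_NEW = re.compile(r"arXiv:\s*(\d{4}\.\d{4,5})(?:v\d+)?", re.IGNORECASE)
# Old-style arXiv identifiers, e.g. "hep-th/9711200" or "math.AG/0601001".
ARXIV_OLD = re.compile(r"\b([a-z][a-z\-]+(?:\.[A-Z]{2})?/\d{7})(?:v\d+)?")
# DOIs, e.g. "10.1103/PhysRevD.86.010001".
DOI = re.compile(r"\b(10\.\d{4,9}/[^\s\"<>{}]+)")


def extract_identifiers(reference: str) -> dict:
    """Return the arXiv id and DOI found in a reference string, or None for each."""
    arxiv_match = ARXIV_NEW.search(reference) or ARXIV_OLD.search(reference)
    doi_match = DOI.search(reference)
    return {
        "arxiv": arxiv_match.group(1) if arxiv_match else None,
        # Strip sentence punctuation that often trails a DOI in a reference list.
        "doi": doi_match.group(1).rstrip(".,;)") if doi_match else None,
    }


if __name__ == "__main__":
    for ref in [
        "A. Author, arXiv:1207.7214v2 [hep-ex]",
        "J. Maldacena, Adv. Theor. Math. Phys. 2 (1998) 231, hep-th/9711200.",
        "B. Author, Phys. Rev. D 86, 010001 (2012), doi:10.1103/PhysRevD.86.010001.",
    ]:
        print(extract_identifiers(ref))
```

Version suffixes such as `v2` are dropped so that all versions of a paper map to the same node in the citation network.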
Hey, I would help with the extension to bioRxiv if you are interested. I am a CS PhD student, but I work with computational models of ecological systems, so I might be able to help for the same reason that high energy physics has the best representation in Paperscape.
@sfrosenb if you want to help with bioRxiv that would be great. But note that the maintainers of this project (@rknegjens and myself) are mostly busy with other things now, so we won't have much time to help out here. As mentioned above, the main thing to do is to extract the citation network from bioRxiv. Do they provide such a thing already? Do they provide source code or downloadable forms of their paper database?
How hard would it be to extend this work to data on bioRxiv, the archive for biology? And, even better, how hard would it be to connect the bioRxiv papers to the arXiv ones?