What is the correct format for a PDF file that GROBID can detect references in? I create PDFs myself, and sometimes they work and sometimes they don’t. I’m not sure about the formatting rules. Can you please let me know?
By "detect references", do you mean detecting reference callouts (e.g. "In previous work [1] we showed that...") or the references section of the article?
For the first case, there is generally not much training data in GROBID (Fulltext model), but it would be easier if you could show me some examples of your generated documents.
There are no "rules" for formatting a document so that GROBID recognises the references. It's more a matter of making the document look like a scientific article.
At first glance, the format of these documents is quite far from the layout of a scientific article. For example, there is no header (at least a title and authors), and the page layout is horizontal (landscape).
Then, most importantly, the references don't match the text, so it is normal that GROBID does not extract them correctly.
I adjusted your document, and now, with more consistency, the result looks much better ;-) Although the body does indeed look like an abstract: Untitled.pdf Untitled.pdf.tei.xml.zip
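If you want to check a result like this yourself, here is a minimal sketch of how the two cases (callouts vs. the references section) can be compared in the TEI output. It assumes that callouts appear as `ref` elements with `type="bibr"` and that bibliography entries appear as `biblStruct` elements inside `listBibl`, each carrying an `xml:id`. The function name `summarize_references` and the sample TEI string are made up for illustration; they are not part of GROBID and are not taken from the attached files.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI documents; xml:id lives in the standard XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def summarize_references(tei_xml):
    """Count bibliography entries and callouts, and list callouts with no matching entry."""
    root = ET.fromstring(tei_xml)

    # Entries of the references section (assumed layout: listBibl/biblStruct).
    entries = root.findall(".//tei:listBibl/tei:biblStruct", TEI_NS)
    entry_ids = {e.get(XML_ID) for e in entries if e.get(XML_ID)}

    # Reference callouts in the text (assumed markup: <ref type="bibr">).
    callouts = root.findall(".//tei:ref[@type='bibr']", TEI_NS)

    # A callout is unresolved when its target is missing or points to no entry.
    unresolved = [
        "".join(c.itertext()).strip()
        for c in callouts
        if (c.get("target") or "").lstrip("#") not in entry_ids
    ]

    return {
        "bibliography_entries": len(entries),
        "callouts": len(callouts),
        "unresolved_callouts": unresolved,
    }


# Hypothetical, hand-written TEI fragment used only to exercise the function.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <p>In previous work <ref type="bibr" target="#b0">[1]</ref> we showed
      that this holds, unlike <ref type="bibr">[7]</ref>.</p>
    </body>
    <back>
      <div type="references">
        <listBibl>
          <biblStruct xml:id="b0"/>
          <biblStruct xml:id="b1"/>
        </listBibl>
      </div>
    </back>
  </text>
</TEI>"""

if __name__ == "__main__":
    print(summarize_references(SAMPLE_TEI))
```

A callout that ends up in the unresolved list is typically one whose number or author-year label does not correspond to any entry in the references section, which is the kind of mismatch mentioned above.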
lfoppiano added the question label and removed the bug label on Aug 13, 2024