Improve name parsing #66
Comments
Hey. I stumbled on this project and figured you might want to re-use some of my stuff. I've made a pretty decent BibTeX parser: https://github.com/digitalheir/bibtex-js/ See if you can use it.
Note: a known issue of the current parser is that it can't handle lowercase particles in the middle of the family name (i.e. ...). @digitalheir I'll take a look at it, thanks!
Don't try to be smarter than the spec, I guess. :^) Standard BibTeX behaviour is to treat all capitalized names before "y" as first names, i.e. Firstname von Lastname. If the user wants Ruiz to be part of the last name, the user should re-format the field as ... See function ...
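To make the "Firstname von Lastname" rule concrete, here is a minimal sketch of how a name without commas could be split. The function name `splitFirstVonLast` and the sample names are assumptions for illustration and are not taken from citation.js or bibtex-js. The sketch also ignores braces, commas and TeX special characters, which a real BibTeX parser has to handle.

```javascript
// Hypothetical helper (not from citation.js or bibtex-js): splits a name
// written without commas following BibTeX's "First von Last" convention.
// Simplification: braces, commas and TeX special characters are ignored.
function splitFirstVonLast(name) {
  const words = name.trim().split(/\s+/);
  const isLower = (word) => /^\p{Ll}/u.test(word);
  const lastIndex = words.length - 1; // the final word always belongs to Last

  // First ends at the first lowercase word, which starts the von part.
  let vonStart = 0;
  while (vonStart < lastIndex && !isLower(words[vonStart])) vonStart++;

  // The von part runs through the last lowercase word before the final word.
  let vonEnd = lastIndex;
  while (vonEnd > vonStart && !isLower(words[vonEnd - 1])) vonEnd--;

  return {
    first: words.slice(0, vonStart).join(" "),
    von: words.slice(vonStart, vonEnd).join(" "),
    last: words.slice(vonEnd).join(" "),
  };
}

// Illustrative input: every capitalized word before "y" ends up in First.
console.log(splitFirstVonLast("Pablo Ruiz y Picasso"));
// { first: 'Pablo Ruiz', von: 'y', last: 'Picasso' }
```

Under this rule "Ruiz" lands in the first-name part simply because it precedes the lowercase particle, which is the behaviour described above.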
Well, this name parsing function is used in other parsers (like Wikidata) as well, so I was talking more generally.
Ah yeah. Parsing names can be a real headache generally. I had the same problem when I tried to look for Dutch names in a big collection of text files. In the end I just prepared a database of known last names: https://github.com/digitalheir/family-names-in-the-netherlands Meertens also keeps a list of first names but I think it's a little harder to scrape.
So there's another bug... 'First M. Last, Jr.' => {given: 'Jr.', family: 'First M. Last'}
Examples:
Possible solutions:
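One direction, purely as a sketch and not the project's actual fix: check whether the part after the comma is a known suffix before treating the input as "Family, Given". The `SUFFIXES` list, the function name `parseNameWithSuffix` and the `suffix` output field are assumptions made for this example. Names with more than one comma are not handled.

```javascript
// Hypothetical suffix list and helper, for illustration only.
const SUFFIXES = new Set(["jr", "sr", "ii", "iii", "iv"]);

function parseNameWithSuffix(name) {
  const [beforeComma, afterComma] = name.split(",").map((part) => part.trim());
  const normalized =
    afterComma === undefined ? "" : afterComma.replace(/\.$/, "").toLowerCase();

  if (afterComma !== undefined && !SUFFIXES.has(normalized)) {
    // "Family, Given" order: the part after the comma is the given name.
    return { given: afterComma, family: beforeComma };
  }

  // "Given Family" or "Given Family, Suffix": the last word is the family name.
  const words = beforeComma.split(/\s+/);
  const result = {
    given: words.slice(0, -1).join(" "),
    family: words[words.length - 1],
  };
  if (afterComma !== undefined) result.suffix = afterComma;
  return result;
}

console.log(parseNameWithSuffix("First M. Last, Jr."));
// { given: 'First M.', family: 'Last', suffix: 'Jr.' }
```

A fixed suffix list is the simplest option but will miss unusual suffixes, so it trades coverage for predictability.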