Multiple hits on a single index #44
Comments
This doesn't look intended to me - looks like a very nice bug that deserves a very nice test case. Your workaround sounds right to me in the interim. Thanks for filing this, I'll work out a fix to this soon. I'm hoping to spend a bunch of time tomorrow or Friday on many of this project's open tickets.
Never promise a timeline on a GitHub ticket! I got swamped, mostly with the /licensing repo. I'll get to this, and the other tickets, soon.
So this is actually expected behavior. When it detects a cite where it can't know whether it's a single section with a hyphen or two sections, it returns all of them, erring on the side of letting the user decide. Here's the code dealing with establishing ambiguity and parsing of ranges:

This is because I built it originally to support a search engine, where you'd want to turn up too many results instead of too few. For a markup tool, I can see why you'd want to be stricter about it. But ultimately, the problem is that we can't always be certain whether a hyphen indicates two sections or one. There are a couple of unambiguous situations -- if there's a double section symbol (

@phearlez, how do you want to handle ambiguous sections? We could add an option that gets passed into the
For my purposes it's sufficient to have the more aggressive hit listed first; I've coped with this already by simply always using the first hit, basically assuming that the more "greedy" hit will be ordered first. An option to avoid ambiguity would be fine as well. Our auto-tagging runs on the assumption that there will still be some human eyes on things eventually; it's a helping hand, not a replacement for involvement. So I'm open to either approach.
Maybe this isn't a bug but intended behavior? I'm not sure why that would be, unless it's a way of presenting ambiguity in section identification so the end user can decide.
In 113hr2642eh we get three hits (all at the same index) on the same string:
For the moment I'm going to cope with this by using the first hit I find at a given index and skipping any subsequent hits at that same index.
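
For anyone hitting the same thing, here is a minimal sketch of that workaround. It is written in Python purely for illustration and is not the library's own code. It assumes each hit is a dict carrying an `index` key and that hits arrive in the order the parser returned them. The function name `first_hit_per_index` and the sample values are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def first_hit_per_index(hits: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the first hit seen at each index, preserving input order.

    Assumption: every hit has an "index" key giving its position in the text.
    """
    seen = set()   # indexes we have already kept a hit for
    kept = []      # first hit per index, in original order
    for hit in hits:
        idx = hit["index"]
        if idx in seen:
            continue  # a later, overlapping hit at the same index: skip it
        seen.add(idx)
        kept.append(hit)
    return kept


# Illustrative data only: three hits at one index, one hit at another.
sample_hits = [
    {"index": 10, "label": "first hit at 10"},
    {"index": 10, "label": "second hit at 10"},
    {"index": 10, "label": "third hit at 10"},
    {"index": 42, "label": "only hit at 42"},
]

deduped = first_hit_per_index(sample_hits)
```

This relies on the ordering assumption mentioned in the thread, namely that the hit you want is the first one returned for a given index. If that ordering ever changes, the filter would silently keep a different interpretation.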