Website resolver usually fails #86

Closed
fuddl opened this issue Jun 19, 2022 · 10 comments

@fuddl
Owner

fuddl commented Jun 19, 2022

https://www.saint-ouen.fr/ should resolve to https://www.wikidata.org/wiki/Q208889 but doesn't

from #84

@fuddl
Owner Author

fuddl commented Jun 19, 2022

@teolemon in this particular scenario it fails because Q208889 stored the URL with http while the website uses https 😅
[Screenshot: Screen Shot 2022-06-19 at 17 26 04]

@teolemon

Yes, that happens with so many stored URLs. It's probably worthwhile handling it in the code (e.g. stripping the scheme prefix if it's just for detection?)

@teolemon

Ah, you created an issue for it. Sorry, I should have explicitly pointed to it in the other issue.

@teolemon

Sorry if I cost you time scratching your head.

@fuddl
Owner Author

fuddl commented Jun 19, 2022

nah, it's fine 🤷

The query already looks like this; we could add another dimension.

SELECT ?item {
  {
    ?item wdt:P953 <https://www.saint-ouen.fr/>.
  } UNION {
    ?item wdt:P973 <https://www.saint-ouen.fr/>.
  } UNION {
    ?item wdt:P856 <https://www.saint-ouen.fr/>.
  } UNION {
    ?item wdt:P2699 <https://www.saint-ouen.fr/>.
  } UNION {
    ?item wdt:P953 <https://www.saint-ouen.fr>.
  } UNION {
    ?item wdt:P973 <https://www.saint-ouen.fr>.
  } UNION {
    ?item wdt:P856 <https://www.saint-ouen.fr>.
  } UNION {
    ?item wdt:P2699 <https://www.saint-ouen.fr>.
  }
}

I don't know if it is possible to have wildcards in URLs, though.
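
One way to sidestep wildcards is to enumerate the likely variants of the URL (http and https, with and without a trailing slash) and match any of them. The following is a minimal Python sketch of that idea, not the extension's actual code: the helper names `url_variants` and `build_query` are hypothetical, the property list is copied from the query above, and the `wdt:` prefix is assumed to be predefined as it is on the Wikidata Query Service.

```python
from urllib.parse import urlsplit, urlunsplit

# Properties copied from the query in this thread.
PROPERTIES = ("P953", "P973", "P856", "P2699")


def url_variants(url):
    """Return http/https variants of url, each with and without a trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    variants = []
    for scheme in ("https", "http"):
        for suffix in ("/", ""):
            candidate = urlunsplit((scheme, parts.netloc, path + suffix, parts.query, ""))
            if candidate not in variants:
                variants.append(candidate)
    return variants


def build_query(url):
    """Build one SPARQL query that matches any variant on any of the properties."""
    # Note: this sketch does not escape characters that are illegal inside <...> IRIs.
    values_urls = " ".join(f"<{v}>" for v in url_variants(url))
    values_props = " ".join(f"wdt:{p}" for p in PROPERTIES)
    return (
        "SELECT DISTINCT ?item WHERE {\n"
        f"  VALUES ?url {{ {values_urls} }}\n"
        f"  VALUES ?prop {{ {values_props} }}\n"
        "  ?item ?prop ?url.\n"
        "}"
    )


if __name__ == "__main__":
    print(build_query("https://www.saint-ouen.fr/"))
```

Using two `VALUES` lists keeps the query short as more variants are added, instead of one `UNION` branch per property and variant. Whether to also try variants with and without `www.` is a separate judgment call that this sketch leaves out.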

@derenrich

I just ran into this when playing with the extension. https://www.hansonrobotics.com/ should resolve to https://www.wikidata.org/wiki/Q48999902

I think you can make this work without using SPARQL, maybe with wbsearchentities, which matches more flexibly.

@fuddl
Owner Author

fuddl commented Oct 2, 2022

> I just ran into this when playing with the extension. https://www.hansonrobotics.com/ should resolve to https://www.wikidata.org/wiki/Q48999902

Right now it does. One issue is that the extension doesn't remember the edit it made right away. The resolver only kicks in when the SPARQL API gives the correct answer. This should be easy to fix though (by adding it to the internal cache).
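
A minimal sketch of the cache idea in Python, assuming a hypothetical `ResolverCache` class and a caller-supplied `query_sparql` function; none of these names come from the extension. The edit is recorded locally as soon as it is made, so a later lookup does not depend on the SPARQL API having caught up.

```python
class ResolverCache:
    """Hypothetical URL-to-item cache that is consulted before the SPARQL API."""

    def __init__(self):
        self._items = {}

    @staticmethod
    def _key(url):
        # Ignore the scheme and a trailing slash so http/https variants share one entry.
        without_scheme = url.split("://", 1)[-1]
        return without_scheme.rstrip("/")

    def remember(self, url, item_id):
        """Record a link the extension just created, before SPARQL knows about it."""
        self._items[self._key(url)] = item_id

    def resolve(self, url, query_sparql):
        """Return the cached item, or fall back to query_sparql(url)."""
        key = self._key(url)
        if key in self._items:
            return self._items[key]
        item_id = query_sparql(url)
        if item_id is not None:
            self._items[key] = item_id
        return item_id


if __name__ == "__main__":
    cache = ResolverCache()
    cache.remember("https://www.hansonrobotics.com/", "Q48999902")
    # The lambda stands in for a SPARQL lookup that has not seen the edit yet.
    print(cache.resolve("http://www.hansonrobotics.com", lambda url: None))
```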

> I think you can make this work without using SPARQL maybe with wbsearchentities which will more flexibly match

Can you show me an example of how to do that? Or is it documented somewhere?

@derenrich

Yeah I figured there was a caching issue with that edit but it should've worked before the edit by fuzzy matching on https (as you described above).

API docs are at https://www.wikidata.org/w/api.php?action=help&modules=wbsearchentities
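
For reference, a minimal Python sketch of how a wbsearchentities request URL could be built. The helper name `search_request_url` and the choice of search term are assumptions for illustration. The thread does not establish whether this module matches URL statement values (to my understanding it searches labels and aliases), so treat this as something to try rather than a confirmed replacement for the SPARQL lookup.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def search_request_url(term, language="en", limit=5):
    """Build a wbsearchentities request URL for the given search term."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Fetch this URL with any HTTP client and inspect the "search" list in the JSON reply.
    print(search_request_url("hansonrobotics.com"))
```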

@fuddl
Owner Author

fuddl commented Oct 2, 2022

I'll try to fix the cache first.

@fuddl
Owner Author

fuddl commented Oct 3, 2022

In version .273 the resolver should be a little more reliable.

@teolemon @derenrich You can try this by adding 'official website' or 'described by url' to an existing item and then reloading or changing tabs. The extension should now be able to find the previously connected item instantly. I also fixed a scenario where the website URL does not end with a /.

I'll close this ticket now since, unfortunately, it is not very well defined. Feel free to open a new one if issues occur 🙏. Please try to describe what behaviour you expect and what is happening instead. The URL resolver is wacky by nature; a URL is not an ID 😭

@fuddl fuddl closed this as completed Oct 3, 2022