Wikidata Q12 is buggy #11

Open
VladimirAlexiev opened this issue Sep 25, 2022 · 5 comments

https://viziquer.lumii.lv/examples/wikidata2022/SPARQL_to_ViziQuer_wikidata.pdf

# ID = 12,
# Question = Recent events
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?event ?date ?eventLabel WHERE{
 ?event wdt:P31/wdt:P279* wd:Q1190554.
 OPTIONAL{?event wdt:P585 ?date.}
 OPTIONAL{?event wdt:P580 ?date.}
 OPTIONAL{?event rdfs:label ?eventLabel. FILTER(LANG(?eventLabel) = 'en')}
 BIND(NOW()-?date AS ?distance)
 FILTER(BOUND(?date) && DATATYPE(?date) =xsd:dateTime)
 FILTER(0 <= ?distance && ?distance < 31) }
LIMIT 10

This query is buggy:

  • it will skip events that have both wdt:P585 and wdt:P580 when the two values disagree
  • if an event has multiple values for a date property (and none is ranked Deprecated or Preferred), it will be returned multiple times. See the query below (/wdt:P279* removed because it causes a timeout):
select ?event ?eventLabel ?date1 ?date2 {
   ?event wdt:P31 wd:Q1190554.
   ?event wdt:P585 ?date1,?date2.
   filter(?date1<?date2)
   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
} limit 10

The bugs come from the original WD query.
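
The first point can be checked in the same way. The following is a minimal sketch modelled on the query above, not a tested query: it lists events whose wdt:P585 and wdt:P580 values differ. The variable names ?pointInTime and ?startTime are chosen here for illustration, and /wdt:P279* is again left out on the assumption that it would time out.

```sparql
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
# Sketch: events that carry both date properties with different values.
SELECT ?event ?eventLabel ?pointInTime ?startTime {
   ?event wdt:P31 wd:Q1190554.          # direct instances only (assumption: avoids the timeout)
   ?event wdt:P585 ?pointInTime.        # "point in time"
   ?event wdt:P580 ?startTime.          # "start time"
   FILTER(?pointInTime != ?startTime)   # keep only the disagreeing pairs
   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
} LIMIT 10
```

Any rows returned are events for which the two OPTIONAL clauses of Q12 cannot both bind ?date.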

karlisc commented Sep 27, 2022

Thanks for the notice!
In fact, we did not check the validity of the original query. In this work we took the SPARQL queries as they were at the point we considered them and tried to see what we could do about generating the visual form.
I might come up with some more specific comments on this query in a couple of days. Meanwhile, if you think that you have a better SPARQL query, we could try to visualize that.

karlisc commented Nov 1, 2022

The following SPARQL query (perhaps corresponding better to the textual formulation)

PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?event ?eventLabel (MIN(?date) AS ?mindate) WHERE{
  ?event wdt:P31/wdt:P279* wd:Q1190554.
  ?event (wdt:P585|wdt:P580) ?date.
  OPTIONAL{?event rdfs:label ?eventLabel. FILTER(LANG(?eventLabel) = 'en')}
  BIND(NOW()-?date AS ?distance)
  FILTER(BOUND(?date) && DATATYPE(?date) = xsd:dateTime)
  FILTER(0 <= ?distance && ?distance < 31)
}
GROUP BY ?event ?eventLabel
LIMIT 10

can be visualized as follows:
[image]
We are working on tuning the path expressions of properties enriched with labels in the attribute position, to allow a presentation like the following one (currently not yet fully working):
[image]

karlisc commented Nov 1, 2022

Query visualization and visual query creation over various data endpoints, including Wikidata and DBpedia, can now be done by anyone interested at https://viziquer.app (the query libraries can be preloaded, inspected and modified).

VladimirAlexiev commented Nov 6, 2022

Thanks @karlisc !

  • ?event wdt:P31/wdt:P279* wd:Q1190554. times out for me. Do you run the query in some special way?
  • FILTER(BOUND(?date) && DATATYPE(?date) = xsd:dateTime) is not needed because
    • ?event (wdt:P585|wdt:P580) ?date. guarantees it'll be bound
    • WD guarantees that these props always record xsd:dateTime (plus extra "structured value" props to indicate the granularity of that timestamp, e.g. days, months, years or decades)
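
Put together, the two points above suggest a variant of the revised query along the following lines. This is a sketch only and has not been run against the endpoint: it drops the BOUND/DATATYPE filter and, as an assumption to sidestep the timeout, matches direct instances only (no /wdt:P279*), so it may return fewer events. It keeps the NOW()-?date arithmetic exactly as in the queries above.

```sparql
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?event ?eventLabel (MIN(?date) AS ?mindate) WHERE{
  ?event wdt:P31 wd:Q1190554.                 # direct instances only (assumption: avoids the timeout)
  ?event (wdt:P585|wdt:P580) ?date.           # either date property; ?date is always bound here
  OPTIONAL{?event rdfs:label ?eventLabel. FILTER(LANG(?eventLabel) = 'en')}
  BIND(NOW()-?date AS ?distance)              # date arithmetic as used in the queries above
  FILTER(0 <= ?distance && ?distance < 31)    # same recency window as the original
}
GROUP BY ?event ?eventLabel                   # one row per event, earliest matching date
LIMIT 10
```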

karlisc commented Nov 23, 2022

Thanks @VladimirAlexiev for the note. The query with wdt:P31/wdt:P279* times out for me as well. I am not sure we can be expected to do anything about that (probably not). This may be related to the more general understanding that the Blazegraph-based Wikidata endpoint is close to its technical limits. I could think of developing certain services for Wikidata-specific visual queries (e.g. including dedicated support for qualifiers); however, it would make much more sense if custom SPARQL were a reliable means of extracting information from Wikidata.
Regarding this ticket, I think we would need to fully implement the other visual form (the one with the single orange box), however long that takes (due to other work priorities); then it could perhaps be closed.
