Markdown generated from Wikipedia missing numerical data #1160

Open
ckadner opened this issue Jun 4, 2024 · 3 comments

Comments

ckadner commented Jun 4, 2024

The Markdown version of the Wikipedia page is missing some numbers, such as the 265 m (869 ft) height of the cliffs surrounding the island.

Wikipedia:

Geography and Geology

Most of Prince Leopold Island rises steeply from the surrounding sea,
with cliffs rising as high as 265 m (869 ft) above sea level surrounding
most of the island.
...
The majority of the island's area of 68 km² (26 sq mi) is occupied by a
plateau generally ranging in altitude between around 240 m (790 ft) and
300 m (980 ft).

Extracted Markdown:

Geography and Geology

Most of Prince Leopold Island rises steeply from the surrounding sea,
with cliffs rising as high as above sea level surrounding most of the
island.
...
The majority of the island's area of is occupied by a plateau generally
ranging in altitude between around and .

Should I update the Markdown (and corresponding commit hash)? There are at least 2 questions referring to those numbers.

Besides the missing height and area measurements, the Markdown does not contain the Coordinates: 74°N 90°W from the Wikipedia page. One question refers to that.

Originally posted by @ckadner in #1154 (comment)

ckadner commented Jun 4, 2024

Similar to Prince Leopold Island, some key numerical values are missing from the Markdown, like the miles-per-hour figures below:


https://en.wikipedia.org/wiki/DMC_DeLorean#Performance

DMC's comparison literature noted that the DeLorean could achieve 0–60 miles per hour (0–97 km/h) in 8.8 seconds when equipped with a manual transmission, but other sources indicate an acceleration time of 9.5 seconds. When equipped with a manual transmission, the DeLorean accelerated from 0 to 60 miles per hour (0 to 97 km/h) in 10.5 seconds as tested by Road & Track magazine. Top speed was claimed at 130 mph (209 km/h), but was tested to be a lacking 110 mph (177 km/h). Car & Driver put the top speed of a 5-speed manual at 117 mph (188 km/h) (indicated)

https://github.com/juliadenham/Summit_knowledge/blob/main/DMC_DeLorean.md#performance

DMC's comparison literature noted that the DeLorean could achieve in 8.8 seconds when equipped with a manual transmission, but other sources indicate an acceleration time of 9.5 seconds. When equipped with a manual transmission, the DeLorean accelerated from in 10.5 seconds as tested by Road & Track magazine. Top speed was claimed at , but was tested to be a lacking . Car & Driver put the top speed of a 5-speed manual at (indicated).

===

https://en.wikipedia.org/wiki/DMC_DeLorean#Base_price

Upon release in 1981, a DeLorean had a base MSRP of $25,000, or equivalent to $84,000 in 2023. MSRP increased in 1982 to $29,825,[57] equivalent to $94,000 in 2023, and again in 1983 to $34,000,[58] equivalent to $104,000 in 2023.

https://github.com/juliadenham/Summit_knowledge/blob/main/DMC_DeLorean.md#base-price

Upon release in 1981, a DeLorean had a base MSRP of $25,000, or . MSRP increased in 1982 to $29,825, , and again in 1983 to $34,000, .

ckadner commented Jun 4, 2024

We could do any of the following to address the problems:

  • update the Markdown files manually, individually
  • remove all questions that are referring to missing numbers
  • figure out and update the tools used to generate the Markdown (if we plan to keep using Wikipedia for knowledge submissions)

We should definitely ...

  • identify and update in-flight PRs that have this issue
  • find out what tools are used to ingest the data from Wikipedia and curate it for the LLM training, and reuse them
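
One way to spot affected files is to look for the footprint a dropped value leaves behind. In the extracted Markdown quoted above, the missing numbers often leave a space directly before punctuation ("claimed at , but", "a lacking ."). The following is a minimal heuristic sketch built on that observation; `find_gaps` and `GAP_PATTERN` are hypothetical names, and the check will miss gaps that leave no such trace (for example "could achieve in 8.8 seconds").

```python
import re

# Heuristic (an assumption, not a guarantee): a dropped value often leaves
# whitespace directly before a comma, period, or semicolon.
GAP_PATTERN = re.compile(r"\s+[,.;](?=\s|$)")


def find_gaps(markdown: str, context: int = 25) -> list[str]:
    """Return a short snippet of text around each suspected gap."""
    snippets = []
    for match in GAP_PATTERN.finditer(markdown):
        start = max(0, match.start() - context)
        end = min(len(markdown), match.end() + context)
        snippets.append(markdown[start:end].strip())
    return snippets


if __name__ == "__main__":
    sample = "Top speed was claimed at , but was tested to be a lacking ."
    for snippet in find_gaps(sample):
        print(repr(snippet))
```

Running this over the Markdown file in each in-flight PR would give a rough list of candidates to review by hand; it is not a substitute for comparing against the Wikipedia source.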

I do have a rough script that ...

  • downloads all source PRs
  • combines the QnA
  • removes duplicates
  • finds similar questions
  • pulls together the co-authors
  • drafts the commit message

I am thinking of hacking up some scripts to ...

  • get the source of the Wiki page, run it through a conversion tool and show a diff
  • compare the answers from the qna.yaml file with the wikitext to identify what answers might not have made it through the conversion process
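
The second idea could be sketched roughly as follows: pull every number out of an answer and flag the answer if any of those numbers does not appear in the converted Markdown. This is only an illustration. The layout of `qna.yaml` is not shown in this thread, so the sketch assumes the question/answer pairs have already been loaded into a list of dicts with `question` and `answer` keys; the function names and the sample question are hypothetical.

```python
import re

# Matches integers and numbers with "." or "," separators, e.g. 265, 8.8, 25,000
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def missing_numbers(answer: str, document: str) -> list[str]:
    """Numbers that appear in the answer but nowhere in the document."""
    doc_numbers = set(NUMBER.findall(document))
    return [n for n in NUMBER.findall(answer) if n not in doc_numbers]


def flag_answers(qna: list[dict], document: str) -> list[tuple[str, list[str]]]:
    """Return (question, missing numbers) for each answer the document cannot support."""
    flagged = []
    for pair in qna:
        missing = missing_numbers(pair["answer"], document)
        if missing:
            flagged.append((pair["question"], missing))
    return flagged


if __name__ == "__main__":
    markdown = "with cliffs rising as high as above sea level surrounding most of the island."
    # Hypothetical question/answer pair, for illustration only
    qna = [{"question": "How high are the cliffs?",
            "answer": "The cliffs rise as high as 265 m (869 ft)."}]
    print(flag_answers(qna, markdown))
```

Matching on numbers alone is deliberately crude: it would not catch missing coordinates written with degree symbols in a different form, and it can flag answers that legitimately rephrase a value.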

ckadner commented Jun 5, 2024

Looking at the Wikitext source for the DMC_DeLorean page shows some wiki markup that might throw off the conversion:

{{convert|0|–|60|mph|}} in 8.8 seconds

https://en.wikipedia.org/wiki/Special:Export/DMC_DeLorean

==Performance==
DMC's comparison literature noted that the DeLorean could achieve {{convert|0|–|60|mph|}} in 8.8 seconds when equipped with a manual transmission,<ref>{{harvnb|Parnham|Withers|2014|p=293}}.</ref> but other sources indicate an acceleration time of 9.5 seconds.<ref name=topspeeddelorean/> When equipped with a manual transmission, the DeLorean accelerated from {{convert|0|to|60|mph|}} in 10.5 seconds as tested by ''Road & Track'' magazine. Top speed was claimed at {{cvt|130|mph|km/h|0}}, but was tested to be a lacking {{cvt|110|mph|km/h|0}}.<ref name=topspeeddelorean/><ref name=roadandtrackdelorean/> ''[[Car & Driver]]'' put the top speed of a 5-speed manual at {{cvt|117|mph|km/h|0}} (indicated).<ref name=CarScoopsDMC/>

The car was described as "not quick for a sports/GT car in this price category" by ''Road & Track''.<ref name="roadandtrackdelorean">{{cite news |url=http://www.roadandtrack.com/car-culture/classic-cars/reviews/a27099/1982-delorean-dmc-12-road-test/ |title=1982 DeLorean DMC-12: The Vintage Road & Track Test |first=John |last=Lamm |date=October 21, 2015 |work=Road & Track |access-date=July 23, 2017 |archive-url=https://web.archive.org/web/20170803211950/http://www.roadandtrack.com/car-culture/classic-cars/reviews/a27099/1982-delorean-dmc-12-road-test/ |archive-date=August 3, 2017 |url-status=dead}}<!--Url-status is set to dead because the site will not load the Pictures in some regions or devices, And the R&T Test Results are only shown in a Picture!--></ref>
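
If the `{{convert}}` and `{{cvt}}` templates are what the converter drops, one possible workaround is to expand them into plain text before conversion. The sketch below is a simplified stand-in, not Wikipedia's actual convert module: it only handles the mph forms seen in the excerpt above, always outputs km/h rounded to a whole number, and leaves every other template untouched. The names `expand_templates` and `expand_mph` are hypothetical.

```python
import re

MPH_TO_KMH = 1.609344  # kilometres per hour in one mile per hour

# Non-nested {{convert|...}} or {{cvt|...}} templates
TEMPLATE = re.compile(r"\{\{(convert|cvt)\|([^{}]*)\}\}")


def _kmh(mph: float) -> int:
    """Convert mph to km/h, rounded to the nearest whole number."""
    return round(mph * MPH_TO_KMH)


def expand_mph(match: re.Match) -> str:
    name = match.group(1)
    args = [a.strip() for a in match.group(2).split("|")]
    # Assumption: cvt abbreviates the unit, convert spells it out
    unit = "mph" if name == "cvt" else "miles per hour"
    try:
        # Range form: {{convert|0|to|60|mph|}}
        if len(args) >= 4 and args[3] == "mph":
            low, sep, high = args[0], args[1], args[2]
            joiner = f" {sep} " if sep.isalpha() else sep
            return (f"{low}{joiner}{high} {unit} "
                    f"({_kmh(float(low))}{joiner}{_kmh(float(high))} km/h)")
        # Single value: {{cvt|130|mph|km/h|0}}
        if len(args) >= 2 and args[1] == "mph":
            return f"{args[0]} {unit} ({_kmh(float(args[0]))} km/h)"
    except ValueError:
        pass  # non-numeric argument: fall through and keep the template
    return match.group(0)


def expand_templates(wikitext: str) -> str:
    """Replace the supported mph templates with plain text."""
    return TEMPLATE.sub(expand_mph, wikitext)


if __name__ == "__main__":
    print(expand_templates("Top speed was claimed at {{cvt|130|mph|km/h|0}}"))
```

For the templates in the Performance section this reproduces the figures shown on the rendered Wikipedia page, but other units (such as the km² and metre values on the Prince Leopold Island page) would need their own handling or a proper wikitext renderer.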

Links for download, and the downloaded file:
