slub/entityfactspicturesmetadataharvester

entityfactspicturesmetadataharvester - EntityFacts pictures metadata harvester

entityfactspicturesmetadataharvester is a command-line tool (a Python 3 program) that reads depiction information (image URLs) from given EntityFacts sheets* (as line-delimited JSON records) and retrieves the Wikimedia Commons file metadata of these pictures (also as line-delimited JSON records).

*) EntityFacts are "fact sheets" on entities of the Integrated Authority File (GND), which is provided by the German National Library (DNB).

Usage

It reads EntityFacts sheets as line-delimited JSON records from stdin.

It writes the (Wikimedia Commons file) metadata of each picture to stdout, one line-delimited JSON record per picture.

entityfactspicturesmetadataharvester

optional arguments:
  -h, --help                           show this help message and exit
  • example:
    entityfactspicturesmetadataharvester < [INPUT LINE-DELIMITED JSON FILE WITH ENTITYFACTS SHEETS] > [OUTPUT PICTURES METADATA LINE-DELIMITED JSON FILE]
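To illustrate the input side of this pipeline, here is a minimal sketch of how line-delimited EntityFacts sheets could be scanned for picture links. It is not the tool's actual implementation. The function name `iter_depictions` is hypothetical, and the JSON field names `depiction`, `@id` and `gndIdentifier` are assumptions about the EntityFacts format that should be checked against real sheets.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_depictions(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (GND identifier, image URL) pairs from line-delimited JSON sheets."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        sheet = json.loads(line)
        depiction = sheet.get("depiction")  # assumed field name
        if not isinstance(depiction, dict):
            continue  # this sheet carries no picture information
        image_url = depiction.get("@id")  # assumed field name
        if image_url:
            # "gndIdentifier" is an assumed field name as well
            yield sheet.get("gndIdentifier"), image_url


# Small in-memory demonstration with made-up records.
sample_lines = [
    '{"gndIdentifier": "X1", "depiction": {"@id": "https://example.org/a.jpg"}}',
    '{"gndIdentifier": "X2"}',
]
for gnd_id, image_url in iter_depictions(sample_lines):
    print(gnd_id, image_url)
```

Passing `sys.stdin` instead of `sample_lines` would process the records streamed into the program, one sheet per line, without loading the whole file into memory.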
    

Note

Each metadata record in the result includes the GND identifier of the EntityFacts sheet in which the picture link was found. It is stored in the field 'gnd_id'.
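As a rough sketch of how the 'gnd_id' field might be used downstream, the following groups the output records by GND identifier. The function name `group_by_gnd_id` is hypothetical. Only the 'gnd_id' field is taken from the note above, and no other field of the metadata records is assumed.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List, Optional


def group_by_gnd_id(lines: Iterable[str]) -> Dict[Optional[str], List[dict]]:
    """Group line-delimited JSON metadata records by their 'gnd_id' field."""
    grouped: Dict[Optional[str], List[dict]] = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip empty lines
        record = json.loads(line)
        # Records without a 'gnd_id' end up under the key None.
        grouped[record.get("gnd_id")].append(record)
    return dict(grouped)
```

The function accepts any iterable of lines, so an open file handle on the tool's output file can be passed in directly.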

Run

  • clone this git repo or just download the entityfactspicturesmetadataharvester.py file
  • run ./entityfactspicturesmetadataharvester.py
  • for a hackish way to use entityfactspicturesmetadataharvester system-wide, copy the script to /usr/local/bin

Install system-wide via pip

sudo -H pip3 install --upgrade [ABSOLUTE PATH TO YOUR LOCAL GIT REPOSITORY OF ENTITYFACTSPICTURESMETADATAHARVESTER]

(this makes entityfactspicturesmetadataharvester available as a system-wide command)

See Also

  • entityfactssheetsharvester - a command-line tool (Python 3 program) that retrieves EntityFacts sheets for the GND identifiers in a given CSV file and returns them as line-delimited JSON records
  • entityfactspicturesharvester - a command-line tool (Python 3 program) that reads depiction information (image URLs) from given EntityFacts sheets (as line-delimited JSON records) and retrieves and stores the pictures and thumbnails referenced in this information
