Dumping ground for scripts that I've used for various repository management & metadata processing tasks at work -- mostly for Internet Archive (archive.org) and migrating out of an old local version of CONTENTdm.
Updated Nov 16, 2022 - Python
A social sharing modal component using lit-element
D-Archive.org: A Python script for downloading books from the Internet Archive, offering PDF or JPEG output options via CLI.
A simple, user-friendly archive of the now-unavailable website, recreated for easy local access and community use.
Fetches host data from the Internet Archive (IA)
ThreatConnect playbook checking if a URL has been archived in the wayback machine.
Download a collection of .mp3 files from Internet Archive site (http://archive.org) and create an audiobook in .m4b format
Poetry identification code from my dissertation; runs on zip files containing DJVUXML from the Internet Archive.
🏛️ Archive all pages specified in the webpage's sitemap to Internet Archive's Wayback Machine
Tool to audit coverartarchive data
Simple Internet Archive book reader for old(er) browsers
🏛 Microservice that redirects to archived version of the URL if found, otherwise saves it to the Internet Archive
WIP, to be updated later.
Set the title page for an Internet Archive text item
Submit URLs listed inside a file to website archival services
Pulls birth/death dates from Wikipedia and generates lifespan-limited searches for several media search engines.
Crawls websites and saves found URLs to a file.
FeedVault is an open-source web application that allows users to archive and search their favorite web feeds.
Send records from an EPrints server to the Internet Archive and other web archives