Repository metadata #22

Closed
sckott opened this issue Mar 5, 2024 · 6 comments
@sckott
Member

sckott commented Mar 5, 2024

WILDS needs consistent metadata across its repos to improve discovery for users and ourselves via a) GitHub itself, and b) the getwilds website.

At least for me it's important to be able to add/remove/edit/update metadata programmatically. We may not have a lot of repos now, but when we do we will regret using tooling that's manual only.

How to metadata?

  • GitHub topics. These are the keywords that are editable on the right hand side of a repo page.
    • proposal: each major category has a topic, and every repo includes the topic for its category. The reason for the wilds prefix is that topics can be anything. For example, a repo could formally be a Dockerfile repo but have some R code in it; someone may apply a topic like r or rstats to that repo, in which case we wouldn't know whether it was really a Docker repo or an R repo. So repos can have any topics they like, but they need to have some exact topics we control so that we can identify the different types with code instead of humans.
      • wilds-r: for an R package
      • wilds-py: for a Python package
      • wilds-docker: for a Dockerfile, or set of files
      • wilds-comp: for a Research Compendia
      • wilds-wdl: for a WDL workflow
      • wilds-nf: for a Nextflow workflow
    • topics can be changed via the gh cli tool using gh repo edit with flags --add-topic/--remove-topic (docs)
    • topics can be changed via the /repos/{owner}/{repo}/topics route with the github api (docs)
  • Badges. WILDS repos are required to have badges in their README
    • proposal: Scrape the badges from the READMEs of each repo
    • right now there is only one required badge, for the status of the repo. As others become required we could scrape those as well. In addition, there are optional badges that could be scraped if we find their information useful, such as a badge related to CRAN or PyPI, build status, or code coverage.
    • we could use the R package codemetar, in particular extract_badges - but there's similar tooling in other languages https://codemeta.github.io/tools/
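To make the topics and badges proposals above a bit more concrete, here is a minimal Python sketch, not a decided implementation. The helper names (`classify`, `topics_payload`, `extract_badges`) and the badge dict keys are hypothetical. It assumes the `PUT` on the topics route replaces the whole topic list with a `names` array, so existing free-form topics are merged in before sending, and it only recognizes linked Markdown badges of the form `[![alt](image)](link)`.

```python
import re

# The controlled topics proposed above, mapped to the repo type each one marks.
WILDS_TOPICS = {
    "wilds-r": "R package",
    "wilds-py": "Python package",
    "wilds-docker": "Dockerfile",
    "wilds-comp": "Research Compendia",
    "wilds-wdl": "WDL workflow",
    "wilds-nf": "Nextflow workflow",
}

# Matches linked Markdown badges: [![alt](image-url)](link-url).
BADGE_RE = re.compile(r"\[!\[([^\]]*)\]\(([^)\s]+)\)\]\(([^)\s]+)\)")


def classify(topics):
    """Return the WILDS repo types implied by a repo's topics, ignoring free-form ones."""
    return [WILDS_TOPICS[t] for t in topics if t in WILDS_TOPICS]


def topics_payload(current_topics, wilds_topic):
    """Build a request body for the topics route that keeps existing topics.

    Assumption: the route replaces all topics, so the current ones are merged in.
    """
    if wilds_topic not in WILDS_TOPICS:
        raise ValueError(f"not a controlled WILDS topic: {wilds_topic}")
    return {"names": sorted(set(current_topics) | {wilds_topic})}


def extract_badges(readme_text):
    """Return one dict per linked Markdown badge found in README text."""
    return [
        {"alt": alt, "image": image, "link": link}
        for alt, image, link in BADGE_RE.findall(readme_text)
    ]
```

With this, a repo tagged `rstats` and `wilds-docker` would classify as a Dockerfile repo only, which is the point of the controlled prefix. Badges written as raw HTML `<img>` tags would need a separate pattern.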

(optional - but I think is kinda the best way to package up metadata across repos)

  • Registry? A machine readable file with all the metadata across all WILDS repos.
    • something like the ropensci registry
    • run the code to generate the registry on a cron schedule so it's constantly being updated
    • creates a machine readable file - available via gh-pages of a repo as a static JSON file
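As a rough illustration of the registry idea, the sketch below turns a list of repo records into a static JSON document. It assumes the records look like the objects the GitHub REST API returns when listing an organization's repos (with `name`, `html_url`, `description`, and `topics` fields). The output layout (`repos`, `wilds_topics`) and the choice to skip repos without a `wilds-` topic are hypothetical, not the format of the actual registry.

```python
import json

WILDS_PREFIX = "wilds-"


def build_registry(repos):
    """Return a JSON string listing every repo that carries a wilds-* topic."""
    entries = []
    for repo in repos:
        wilds_topics = sorted(
            t for t in repo.get("topics", []) if t.startswith(WILDS_PREFIX)
        )
        if not wilds_topics:
            continue  # assumption: untagged repos stay out of the registry
        entries.append(
            {
                "name": repo["name"],
                "url": repo["html_url"],
                "description": repo.get("description"),
                "wilds_topics": wilds_topics,
            }
        )
    # Sort entries and keys so the file only changes when the metadata does.
    entries.sort(key=lambda entry: entry["name"])
    return json.dumps({"repos": entries}, indent=2, sort_keys=True)
```

A scheduled job could write the returned string to a file served from gh-pages. Keeping the output deterministic means a cron run that finds no metadata changes produces no diff.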
@seankross

seankross commented Mar 7, 2024

I agree with this proposal, with a couple of clarifications:

  • I want to get rid of "translational analytics analysis" as a category for what's in WILDS and replace it with "Research Compendia." Let's use wilds-comp as the topic tag. @monicagerber are you okay with this?
  • Is your vision for the registry that it's automatically maintained?
  • Let's also add wilds-nf for nextflow.

@sckott
Member Author

sckott commented Mar 8, 2024

  • I'll wait for your ok monica on wilds-comp.
  • yes, automatically updated on a cron schedule plus triggered by certain repo events so we have up to date data for important things
  • I'll add wilds-nf

@monicagerber
Contributor

Agree!

@sckott
Member Author

sckott commented Mar 16, 2024

Okay, registry repo is started https://github.com/getwilds/registry - super basic for now

@sckott
Member Author

sckott commented Mar 18, 2024

@seankross let me know any thoughts on the registry repo above when you get a chance

@sckott
Member Author

sckott commented Oct 21, 2024

seems done

sckott closed this as completed Oct 21, 2024
3 participants