Review "tags" field in CfA's organizations.json #33

Open
themightychris opened this issue Nov 8, 2019 · 5 comments

@themightychris
Collaborator

themightychris commented Nov 8, 2019

I just noticed that the organizations.json database CfA maintains (which drives both the index and CfA's internal tooling, and for which CfA staff currently moderate brigade captains' access to make changes) includes a tags field at the brigade level: https://github.com/codeforamerica/brigade-information

It is documented as such:


  • tags (Required) - An array of descriptors for your group.
    Some commonly used tags are:
    • Brigade
    • Official
    • Code for America - Code for America Brigade
    • Code for America Partner Brigade - A separate 501(c)3 nonprofit that is an official Code for America Brigade.
    • Code for America Fiscally Sponsored Brigade - A Brigade that uses a non-Code for America Fiscal Sponsor.
    • Code for All - Code for All network member ("Governing Partner")
    • Code for All Affiliate - Code for All network Affiliate Partner
    • Fellowship - organization is running a fellowship program
    • Government

They appear to be applied pretty consistently in the data: https://github.com/codeforamerica/brigade-information/blob/master/organizations.json

Hack for LA, for example, has this entry:

        "tags": [
            "Brigade",
            "Code for America",
            "Code for America Fiscally Sponsored Brigade",
            "Official"
        ]

There are, however, a few odd values like midwest that were added without enough qualification to work as global tags.

Some things this makes me wonder:

  • Do these represent tags that are already being used substantially "in the wild" on GitHub as project topics? Or is their use strictly internal, for categorizing things within this list?
    • If the former, how could we reconcile this with our own preferred form right now being code-for-america?
    • If use of these as GitHub topics in the wild is sparse, would we just snake-case all the tags in here and use them as a mapping in the index? It seems like many of them wouldn't be things we'd want tagged on GitHub projects.
  • Should we add another field to organizations.json for "synonymous tags", so that at the same time a brigade declares that "hack-for-la" is their official projects tag, they might also expressly document that "code-for-america" and "civic-hacking" are some separate recommended_project_tags? Or should we just whitelist some of the existing tags in organizations.json like "Code for All" and "Code for America" and infer that code-for-all and code-for-america should also be applied to all projects tagged to that brigade?
@ckingbailey
Collaborator

* Do these represent tags that are being used substantially "in the wild" on GitHub already as project topics? Or is their use strictly internal to categorizing things within this list

Did some quick research on this. I didn't find anything for:

  • code-for-america + fellowship
  • code-for-all
  • codeforall
  • code_for_all
  • code-for-america-partner-brigade

It doesn't appear these tags are being used widely in the wild. I think @tdooner could speak to this more accurately, but I think the purpose of these tags is for querying CfAPI, as in /api/organizations?tags[]=Code%20for%20America&tags[]=Brigade
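For illustration, here is a minimal sketch of how a client might build that query string in Python. Only the path and the tags[] parameter come from the example above; the function name `organizations_query` is made up for this sketch, and no host is assumed.

```python
from urllib.parse import quote, urlencode


def organizations_query(tags):
    """Build the path and query string for filtering organizations by tag."""
    # Repeat the tags[] parameter once per tag. Brackets are kept literal
    # and spaces are encoded as %20, matching the form quoted above.
    pairs = [("tags[]", tag) for tag in tags]
    return "/api/organizations?" + urlencode(pairs, safe="[]", quote_via=quote)


print(organizations_query(["Code for America", "Brigade"]))
# prints /api/organizations?tags[]=Code%20for%20America&tags[]=Brigade
```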

  * If use of these as GitHub topics in the wild is sparse, would we just snake-case all the tags in here and use it as a mapping in the index. It seems like many of them wouldn't be things we'd want tagged on github projects

snake_case, or kebab-case? I'd vote for kebab-case to be consistent with our GH topic tagging recommendations, unless there's a reason the index prefers underscores to hyphens. I'd rather not have to remember that it's snake_case here, kebab-case there, and camelCase somewhere else.
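If we went with kebab-case, the conversion from the display tags in organizations.json could be as small as the sketch below. The helper name `to_kebab_case` is hypothetical, and this assumes tags only need lowercasing and hyphen-joining, which may not hold for every value in the file.

```python
import re


def to_kebab_case(tag):
    """Lowercase a display tag and join its words with hyphens."""
    # Any run of characters other than a-z or 0-9 becomes a single hyphen;
    # leading and trailing hyphens are trimmed.
    return re.sub(r"[^a-z0-9]+", "-", tag.lower()).strip("-")


print(to_kebab_case("Code for America"))        # prints code-for-america
print(to_kebab_case("Code for All Affiliate"))  # prints code-for-all-affiliate
```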

* Should we add another field to organizations.json for "synonymous tags", so that at the same time a brigade declares that "hack-for-la" is their official projects tag, they might also expressly document that "code-for-america" and "civic-hacking" are some separate `recommended_project_tags`? Or should we just whitelist some of the existing tags in organizations.json like "Code for All" and "Code for America" and infer that `code-for-all` and `code-for-america` should be applied too to all projects tagged to that brigade?

I like the idea of whitelisting an official version for the ones that will be used across many brigades, and leaving them out of organizations.json. I'd still like to see a field on organizations.json to store a brigade's official topic tag, like openoakland.

@tdooner
Contributor

tdooner commented Nov 8, 2019

@ckingbailey Yes, the tags are meant for filtering the list of Brigades. As far as I know, the only usage in the wild is the Brigade website:

https://github.com/codeforamerica/brigade/blob/master/cfapi/__init__.py#L84

I don't see a benefit to trying to represent these tags on GitHub repos, since they describe different things. The tags we're talking about for the project index are for tagging projects, whereas the tags in the CfAPI are for the brigades themselves.

@themightychris
Collaborator Author

themightychris commented Nov 8, 2019

Thanks for clearing that up @tdooner. These tags would be accurate at least for determining which organizations can be rolled up under code-for-all and code-for-america, right?

I guess what I'm really seeking to figure out is, as we start collecting the brigade tag for projects (e.g. projects_tag: "openoakland"), what other information from organizations.json would be accurate/helpful for projects to assume by relation? If this tags field isn't rigorously maintained I could see the answer being "nothing", but if it IS rigorously maintained... I think it would make sense, for example, for a search in the index for all projects matching code-for-america to also include all projects tagged openoakland, because the Open Oakland entry in organizations.json is tagged "Code for America".

I could see it being useful for the crawler to apply these "inferred" tags while building the index. That is, tools using the index wouldn't need to know to include openoakland whenever someone wants to filter by code-for-america, because the index might already have the code-for-america or code-for-all tags automatically inserted for you.
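A rough sketch of what that inference step might look like, not how the crawler actually works. The `INFERRED_TOPICS` whitelist and the `infer_topics` function are hypothetical, the sample entry is trimmed down for illustration, and "housing" is a made-up project topic; only projects_tag, tags, and the tag values come from this thread.

```python
# Hypothetical whitelist: organization-level tag -> project-level topic.
INFERRED_TOPICS = {
    "Code for America": "code-for-america",
    "Code for All": "code-for-all",
}


def infer_topics(org, project_topics):
    """Return the project's topics plus any inferred from its organization."""
    topics = list(project_topics)
    # Only infer when the project carries the organization's projects_tag.
    if org.get("projects_tag") not in topics:
        return topics
    for org_tag in org.get("tags", []):
        inferred = INFERRED_TOPICS.get(org_tag)
        if inferred and inferred not in topics:
            topics.append(inferred)
    return topics


# Illustrative entry only; the real record has more fields.
open_oakland = {
    "name": "Open Oakland",
    "projects_tag": "openoakland",
    "tags": ["Brigade", "Code for America"],
}

print(infer_topics(open_oakland, ["openoakland", "housing"]))
# prints ['openoakland', 'housing', 'code-for-america']
```

Doing this at crawl time keeps the whitelist in one place, so tools reading the index can filter on code-for-america without knowing which brigades roll up under it.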

@ckingbailey thanks for surveying their use, sounds like we don't need to consider projects already using them so the question is just what can we infer from them?

  1. it seems like code-for-all / code-for-america could be inferred for projects based on these tags + projects_tag
  2. it seems like region tags (I saw midwest in there a couple times) weren't rigorously maintained and aren't usable
  3. it seems like "Brigade", "Government", and "Official" are rigorously maintained. Are there tags these should surface as at the project level or are they just interesting bits/filters for the orgs list?

@tdooner
Contributor

tdooner commented Nov 10, 2019

these tags would be accurate at least for determining which organizations can be rolled up under code-for-all and code-for-america, right?

I only maintain the tags for brigades tagged with "Code for America". The tags I maintain are: "Code for America Fiscally Sponsored Brigade", "Code for America Partner Brigade", and "Official". I don't maintain "Government".

what other information from organizations.json would be accurate/helpful for projects to assume by relation

The "Official" tag is relevant because I will remove it when Brigades are no longer part of the network. We probably want to de-list their projects from the index at that time. Other than that, idk, just the other metadata fields about the brigade could be useful in how we display the project (e.g. "Click here to go to the brigade's website!" or something like that). But that can probably be handled at a layer other than the index data layer.
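As a sketch of that de-listing rule, assuming each organization carries a projects_tag field (proposed earlier in this thread, not something in organizations.json today) and each indexed project has a list of topics. Both function names and the "topics" key are hypothetical.

```python
def official_projects_tags(organizations):
    """Collect projects_tag values for organizations still tagged "Official"."""
    return {
        org["projects_tag"]
        for org in organizations
        if "Official" in org.get("tags", []) and org.get("projects_tag")
    }


def listed_projects(projects, organizations):
    """Keep only projects whose topics include an official brigade's tag."""
    allowed = official_projects_tags(organizations)
    return [p for p in projects if allowed & set(p.get("topics", []))]


# Made-up sample data for illustration.
orgs = [
    {"projects_tag": "openoakland", "tags": ["Brigade", "Official"]},
    {"projects_tag": "former-brigade", "tags": ["Brigade"]},
]
projects = [
    {"name": "a", "topics": ["openoakland"]},
    {"name": "b", "topics": ["former-brigade"]},
]

print([p["name"] for p in listed_projects(projects, orgs)])
# prints ['a']
```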

it seems like region tags (I saw midwest in there a couple times) weren't rigorously maintained and aren't usable

Yeah, some people put those in there, and I didn't have the heart to remove them.

@tdooner
Contributor

tdooner commented Nov 10, 2019

I don't know how much you care about these tags @themightychris, but you can also see how I maintain them. This is the script I use to reconcile the organizations.json file with Salesforce:

https://github.com/codeforamerica/brigade-information/blob/master/bin/merge-from-salesforce#L201-L212

I run this script once a week whenever there are changes to the brigade list.
