Expand the dataset to include a wider variety of urls and labels #30

Open
EvilDrPurple opened this issue Jan 18, 2024 · 3 comments
Labels
annotation, enhancement (New feature or request)

Comments

@EvilDrPurple
Contributor

EvilDrPurple commented Jan 18, 2024

Context

A wider variety of URLs and the record type labels associated with them are needed to facilitate Hugging Face model training.

Requirements

  • More URLs need to be manually labeled, especially those associated with under-utilized labels.
  • Let's aim for about 100 labeled URLs of each record type.

Under-utilized labels

  • Accident Reports
  • Booking Reports
  • Budgets & Finances
  • Car GPS
  • Citations
  • Dispatch Recordings
  • Field Contacts
  • Inmate Records
  • List of Data Sources
  • Sex Offender Registry
  • Stops
  • Use of Force Reports
  • Vehicle Pursuits
  • Wanted Persons
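
To track progress toward the 100-per-type goal, something like the sketch below could report how many more URLs each under-utilized label still needs. It assumes the labeled dataset can be read as a flat list of record type strings, one per labeled URL; the function name `labeling_gaps` and the sample data are hypothetical and not part of the project's code.

```python
from collections import Counter

# Target taken from the requirement above: about 100 labeled URLs per type.
TARGET_PER_LABEL = 100

# The under-utilized labels listed in this issue.
UNDER_UTILIZED = [
    "Accident Reports", "Booking Reports", "Budgets & Finances", "Car GPS",
    "Citations", "Dispatch Recordings", "Field Contacts", "Inmate Records",
    "List of Data Sources", "Sex Offender Registry", "Stops",
    "Use of Force Reports", "Vehicle Pursuits", "Wanted Persons",
]


def labeling_gaps(labels, target=TARGET_PER_LABEL, watch=UNDER_UTILIZED):
    """Return {label: number of additional URLs needed to reach target}.

    `labels` is assumed to be an iterable with one record type string
    per labeled URL (a hypothetical input format).
    """
    counts = Counter(labels)
    # Labels already at or above the target report a gap of 0.
    return {name: max(target - counts[name], 0) for name in watch}


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = ["Stops"] * 3 + ["Citations"] * 120
    for name, missing in labeling_gaps(sample).items():
        print(f"{name}: {missing} more needed")
```

Sorting the result by gap size would show which record types to prioritize when labeling manually.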

Related

#63
Police-Data-Accessibility-Project/data-projects#8
Police-Data-Accessibility-Project/data-projects#9

josh-chamberlain added the enhancement and annotation labels on Jan 18, 2024
@maxachis
Collaborator

@josh-chamberlain @EvilDrPurple What's the current procedure for obtaining additional URLs? Are there specific sources or websites where these types of records are more likely to be found? Do we have a way of finding them automatically, or is it done manually?

@josh-chamberlain
Contributor

I believe I answered here: #16 (comment)

@josh-chamberlain
Contributor

The current thinking is to try creating keyword-based batches in #88.
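
As a rough illustration of the keyword idea, the sketch below groups candidate URLs into per-label batches by substring matching. This is only an assumption about what keyword batching could look like, not a description of what #88 does; the function name `batch_by_keywords`, the keyword lists, and the example URLs are all hypothetical.

```python
def batch_by_keywords(urls, keywords_by_label):
    """Group URLs into batches whose URL text contains a label's keyword.

    `keywords_by_label` maps a record type to lowercase keywords.
    A URL may land in more than one batch; every batch would still
    need manual review before its labels are trusted.
    """
    batches = {label: [] for label in keywords_by_label}
    for url in urls:
        lowered = url.lower()
        for label, keywords in keywords_by_label.items():
            if any(keyword in lowered for keyword in keywords):
                batches[label].append(url)
    return batches


if __name__ == "__main__":
    # Hypothetical keywords and URLs, for illustration only.
    keywords = {
        "Sex Offender Registry": ["sex-offender", "offender-registry"],
        "Wanted Persons": ["wanted", "most-wanted"],
    }
    candidates = [
        "https://example.com/police/most-wanted",
        "https://example.com/sheriff/sex-offender-search",
        "https://example.com/parks/events",
    ]
    for label, batch in batch_by_keywords(candidates, keywords).items():
        print(label, batch)
```

Matching on the URL text alone is a deliberately simple choice here; matching on page titles or content would likely catch more candidates but requires fetching each page.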

Projects
Status: Reference