@jayckaiser
Contributor

Feature: Persistent Student ID Best Match File

Description & motivation

In the current version of the student_id bundle, if an assessment source file does not hit the REQUIRED_ID_MATCH_RATE threshold, Earthmover fails, and so no non-matched student output file is generated.

This PR makes the following updates:

  • Always succeed in student ID crosswalking, and output the full source file if no ID match hits the required threshold.
  • Output a plain-text description of the IDs to be updated alongside the non-matched students file.
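
For reviewers unfamiliar with the bundle, the intended behavior can be sketched roughly as below. This is an illustrative Python approximation only, not the Earthmover YAML in this PR: the function name `crosswalk_student_ids`, the `MatchResult` fields, the column names, and the threshold value are all hypothetical, and it assumes the full source file is routed to the non-matched output when the threshold is missed.

```python
from dataclasses import dataclass

# Assumed value for illustration; the real threshold is set in the bundle config.
REQUIRED_ID_MATCH_RATE = 0.5


@dataclass
class MatchResult:
    best_column: str    # candidate ID column with the highest match rate
    match_rate: float   # share of source rows whose ID was found in the roster
    matched: list       # rows that can be loaded (empty if threshold is missed)
    non_matched: list   # rows written to the non-matched students file


def crosswalk_student_ids(rows, candidate_columns, known_ids,
                          threshold=REQUIRED_ID_MATCH_RATE):
    """Pick the best-matching ID column without failing on a low match rate."""

    def rate(column):
        # Fraction of rows whose value in this column is a known student ID.
        if not rows:
            return 0.0
        hits = sum(1 for row in rows if row.get(column) in known_ids)
        return hits / len(rows)

    best_column = max(candidate_columns, key=rate)
    best_rate = rate(best_column)

    if best_rate < threshold:
        # Old behavior would fail here. New behavior: succeed, and emit the
        # full source file so that no student is silently dropped.
        return MatchResult(best_column, best_rate, [], list(rows))

    matched = [r for r in rows if r.get(best_column) in known_ids]
    non_matched = [r for r in rows if r.get(best_column) not in known_ids]
    return MatchResult(best_column, best_rate, matched, non_matched)


# Hypothetical sample data: two of four rows match on "state_id".
sample_rows = [
    {"state_id": "1", "local_id": "A"},
    {"state_id": "2", "local_id": "B"},
    {"state_id": "9", "local_id": "C"},
    {"state_id": "8", "local_id": "D"},
]
roster_ids = {"1", "2", "3"}

below = crosswalk_student_ids(sample_rows, ["state_id", "local_id"], roster_ids, threshold=0.75)
above = crosswalk_student_ids(sample_rows, ["state_id", "local_id"], roster_ids, threshold=0.5)
```

In the sketch, `below` misses the threshold and returns every source row as non-matched, while `above` splits the rows into matched and non-matched. In both cases the best candidate column is still reported, which is the information the plain-text description is meant to surface.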

PR Merge Priority:

  • Low
  • Medium
  • High

This branch (and all dependent repo branches) is required for the new round of assessment loading in South Carolina. I can keep these branches pinned for the moment, but eventually they should be merged to main. Note that because we pin branches directly in the earthmover_packages.yaml file in our implementation repos, it is not essential for this dependency to be merged before other repo PRs.

Changes to existing files:

  • packages/student_ids/earthmover.yaml: Incorporate changes outlined above.

New files created:

  • packages/student_ids/best_id_match.txtt: Create a new text template for describing subsequent partner action for resolving non-matched students.
  • packages/student_ids/no_match.csv: A one-line no-match default to ensure non-matched students are not dropped.

Tests and QC done:

This branch has been tested extensively in South Carolina Test and QA.

Future ToDos & Questions:

  • The best_id_match.txtt template may be too verbose for the generic package. Perhaps it should be simplified here, with implementation-specific variations composed as needed.
  • The no_match.csv file should be replaced with an inline source once we incorporate Earthmover PR131.

…//github.com/edanalytics/earthmover_edfi_bundles into feature/student_id_wrapper_require_rows_exit_code__human_readable_student_id_match_file
…one record is always returned in best_id_match.
…ed rate; Turn off require_rows failures to move this logic into the EM DAG.
…g different outputs on successful and unsuccessful runs.
Fix minor spelling error.
Remove match rate from output file (as this only reflects the very first run).