feature/: Exit Withdraw Date code to handle inclusivity/exclusivity #201
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description & motivation
If you ask Edfi if the enrollment exit_withdraw_date should considered inclusive (this is the last day of the enrollment and the student attended on that day) or exclusive (the last day the student attended for their enrollment was the day prior) then they will tell you "treat it however it makes sense for your implementation". Dandy. But EDU assumes that this date is inclusive. Here at TDOE, this is the ONLY date that we consider exclusive. So lots of the models, and what not, are not accurate. In particular, the "is_active_enrollment" in fct_student_school_association. But also, many of the attendance models would also be inaccurate for us. The changes presented incorporate a new variable that can be set,
edu:enroll:exit_withdraw_date_inclusive, and any models that use exit_withdraw_date will now utilize a few new macros to evaluate exit_withdraw_date from an inclusive or exclusive manner, however configured. The default for this variable is that exit_withdraw_date is true, so the output of EDU shouldn't change for people who came before TDOE.Breaking changes introduced by this PR:
There should be no breaking changes. However, two caveats:
tests/integrity_tests/all_student_school_associations_have_calendars.sql. There is a comparison between the first day of school and the enrollment exit_withdraw_date. But everywhere in EDU, it treats exit_withdraw_date as inclusive and uses <= instead of < when comparing dates. Unless I'm missing something, and, admittedly, I might be, this code should have fct_stu_school_assoc.exit_withdraw_date >= first_school_day.first_school_day instead of what is currently there. If the comparator was >= then this would include students who only attended the first day of school and then exited. (I think).snowflake__day_count_in_range. There is a modelfct_student_school_attendance_eventthat would calculate the number of days for an enrollment. In Databricks, for an inclusive end date, this calculated int would be wrong. Take a 2 day enrollment for example, Nov 1 to Nov 2. Supposing Nov 2 is inclusive in this example, the expected value is 2, but the code was producing a 1. I am assuming Snowflake works the same, so I am adding a +1 to the Snowflake calculation in order to account for this. I don't have Snowflake, so I could be wrong and this will need fixing. I am not sure. However, in any event, the value calculated isn't even used in the output of the model, so it may not matter that it might have been off by 1 day. In any event, there may be dragons here.All other changes should be completely transparent since the default behavior of the enhancement favors the current EDU implementation.
PR Merge Priority:
I would prefer to set this to Medium because we want to get off of our own forks of edu_wh and we won't be able to until certain other things are merged into edu_wh main. However, there is another change we have in the queue that deals with adding CDS (and an enhanced impl of CDS at that) to basically every dim and fact. So I left this as Low.
Changes to existing files:
models/core_warehouse/fct_student_daily_attendance.sql: Changed a case statement to use the new macro for determing if a date counts within the enrollment end date.models/core_warehouse/fct_student_school_association.sql: Modified all the comparisons to exit_withdraw_date to use the new macros.models/core_warehouse/fct_student_school_attendance_event.sql: Modified all the comparisons to exit_withdraw_date to use the new macros. This includes a join on dim_calendar_date.models/qc/sections_per_enrollment.sql: Modified all the comparisons to exit_withdraw_date for the overlap logic with section dates.tests/integrity_tests/all_student_school_associations_have_calendars.sql: Modified the comparison to exit_withdraw_date for the students who exited before the first day of school. The comment here is confusing. But also, I think there's a bug in the original code. See above section on stuff that might be broken.tests/value_tests/invalid_enrollments.sql: Modified the comparisons to exit_withdraw_date for the columns of if they exited before their first day or before entry.tests/value_tests/overlapping_enrollments.sql: Modified the join clause to not use between and use the macro instead of the second date normally in the between.New files created:
macros/end_date_utils.sql: Contains the macros necessary to correctly evaluate dates inclusivity/exclusivity.Tests and QC done:
I tested only the databricks side of this since I do not have Snowflake. Things appear to be working as expected.
edu_wh PR Review Checklist:
Make sure the following have been completed before approving this PR:
## New Featuresfor features, etc.