Add Status Audit Management Command #667
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,98 @@ | ||
import json | ||
import logging | ||
|
||
from django.core.management.base import BaseCommand | ||
from tqdm import tqdm | ||
|
||
from alert.models import Course, Section | ||
from courses import registrar | ||
from courses.util import get_course_and_section, get_current_semester, translate_semester_inv | ||
|
||
|
||
class Command(BaseCommand): | ||
help = """ | ||
Generate an audit report that demonstrates the differences between our | ||
database and the course statuses received from using the OpenData | ||
endpoint directly. | ||
Note that this script DOES NOT make any changes to the database, just | ||
generates a textfile report | ||
""" | ||
|
||
def handle(self, *args, **options): | ||
root_logger = logging.getLogger("") | ||
root_logger.setLevel(logging.DEBUG) | ||
|
||
semester = get_current_semester() | ||
statuses = registrar.get_all_course_status(semester) | ||
stats = { | ||
"missing_data": 0, | ||
"section_not_found": 0, | ||
"duplicate_updates": 0, | ||
"unsynced_updates": 0, | ||
} | ||
unsynced_courses = [] | ||
for status in tqdm(statuses): | ||
data = status | ||
section_code = data.get("section_id_normalized") | ||
if section_code is None: | ||
stats["missing_data"] += 1 | ||
continue | ||
|
||
course_status = data.get("status") | ||
if course_status is None: | ||
stats["missing_data"] += 1 | ||
continue | ||
|
||
course_term = data.get("term") | ||
if course_term is None: | ||
stats["missing_data"] += 1 | ||
continue | ||
if any(course_term.endswith(s) for s in ["10", "20", "30"]): | ||
course_term = translate_semester_inv(course_term) | ||
|
||
# Ignore sections not in db | ||
try: | ||
_, section = get_course_and_section(section_code, semester) | ||
except (Section.DoesNotExist, Course.DoesNotExist): | ||
stats["section_not_found"] += 1 | ||
continue | ||
|
||
# Ignore duplicate updates | ||
last_status_update = section.last_status_update | ||
current_status = section.status | ||
if current_status == course_status: | ||
stats["duplicate_updates"] += 1 | ||
continue | ||
|
||
stats["unsynced_updates"] += 1 | ||
unsynced_courses.append( | ||
(section_code, last_status_update.new_status if last_status_update else None, current_status, course_status) | ||
) | ||
|
||
# Write out statistics and missing courses to an output file. | ||
with open("./status_audit.txt", "w") as f: | ||
Review comment: If we want to run this on a cron, we may want to allow the file name to be passed in or to include the UNIX timestamp.
Reply: Yup, sounds good. Do you think we should write it to S3 or something like that? Maybe it'd be cool if we could set up a Slack integration and get notifs that way, but that might be a lot of extra work for little reward. |
||
f.write("Summary Statistics\n") | ||
f.write(json.dumps(stats) + "\n\n") | ||
|
||
f.write( | ||
"""Courses Out of Sync\nCourse Code / Last Update Status / | ||
Our Stored Status / Actual Status\n""" | ||
Review comment: What does "actual status" mean? Might be good to elaborate because I'm getting it mixed up in my head.
Reply: Actual status is the Path@Penn (i.e., the accurate) status of a course. I'll rename it a bit and also add an in-line comment explaining it. |
||
) | ||
f.write("Our Status Matches Last Update\n") | ||
f.writelines( | ||
[ | ||
f"{course[0]} / {course[1]} / {course[2]} / {course[3]}\n" | ||
for course in unsynced_courses | ||
if course[1] == course[2] | ||
] | ||
) | ||
|
||
f.write("\nOur Status Does Not Match Last Update\n") | ||
f.writelines( | ||
[ | ||
f"{course[0]} / {course[1]} / {course[2]} / {course[3]}\n" | ||
for course in unsynced_courses | ||
if course[1] != course[2] | ||
] | ||
) |
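The review thread on the open("./status_audit.txt", "w") line suggests letting the file name be passed in or including a UNIX timestamp so the command can run on a cron. A minimal sketch of that idea follows. It uses plain argparse so it runs without Django; Django's BaseCommand.add_arguments receives an argparse parser, so the same add_argument call should carry over. The --output flag name and the default_report_path and resolve_output_path helpers are assumptions for illustration, not part of this PR.

```python
import argparse
import time


def default_report_path(now=None):
    """Build a report file name that embeds a UNIX timestamp (hypothetical helper)."""
    timestamp = int(time.time() if now is None else now)
    return f"./status_audit_{timestamp}.txt"


def add_arguments(parser):
    # Same shape as BaseCommand.add_arguments, which is handed an argparse parser.
    # The --output flag name is an assumption, not something this PR defines.
    parser.add_argument(
        "--output",
        default=None,
        help="Path of the audit report; defaults to a timestamped file name.",
    )


def resolve_output_path(options, now=None):
    """Prefer the caller-supplied path, else fall back to the timestamped default."""
    return options.get("output") or default_report_path(now)


parser = argparse.ArgumentParser()
add_arguments(parser)
explicit = vars(parser.parse_args(["--output", "audit.txt"]))
implicit = vars(parser.parse_args([]))
```

Inside handle, the resolved path would then replace the hard-coded "./status_audit.txt", so repeated cron runs would not overwrite each other unless a fixed name is passed explicitly.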
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
Summary Statistics | ||
{"missing_data": 0, "section_not_found": 0, "duplicate_updates": 5711, "unsynced_updates": 49} | ||
|
||
Courses Out of Sync | ||
Course Code / Last Update Status /Our Stored Status / Actual Status | ||
Our Status Matches Last Update | ||
ANTH-2550-401 / C / C / O | ||
BE-2000-201 / O / O / C | ||
BE-5650-001 / C / C / O | ||
BEPP-2020-402 / C / C / O | ||
BEPP-2030-001 / C / C / O | ||
CHEM-2412-142 / C / C / O | ||
CHEM-2510-220 / O / O / C | ||
COMM-6030-401 / O / O / C | ||
EDUC-5256-001 / C / C / O | ||
ESE-2040-001 / O / O / X | ||
ESE-2040-201 / O / O / X | ||
ESE-2180-001 / C / C / O | ||
ESE-2180-101 / C / C / O | ||
FNAR-1070-402 / C / C / O | ||
FNAR-5011-402 / C / C / O | ||
FNCE-2020-402 / C / C / O | ||
GRMN-1800-001 / O / O / C | ||
HSOC-2002-403 / C / C / O | ||
LGST-2190-001 / O / O / C | ||
LGST-2920-402 / O / O / C | ||
MGMT-2920-402 / O / O / C | ||
MGMT-3010-001 / C / C / O | ||
MGMT-3010-005 / C / C / O | ||
NURS-1640-110 / C / C / O | ||
OIDD-2920-402 / O / O / C | ||
PSCI-1800-204 / C / C / O | ||
PSYC-1777-001 / C / C / O | ||
SAST-2550-401 / C / C / O | ||
SOCI-2000-403 / C / C / O | ||
SPAN-1800-303 / C / C / O | ||
SSPP-6030-401 / O / O / C | ||
SWRK-6020-001 / C / C / O | ||
SWRK-6030-003 / C / C / O | ||
SWRK-7770-001 / O / O / C | ||
|
||
Our Status Does Not Match Last Update | ||
ANTH-0120-404 / C / O / C | ||
BIOL-1101-102 / O / C / O | ||
CHEM-1102-185 / O / C / O | ||
CIS-1210-214 / C / O / C | ||
EALC-0730-406 / C / O / C | ||
HIST-0550-406 / C / O / C | ||
LALS-4240-401 / C / O / C | ||
MEAM-2100-202 / O / C / O | ||
MEAM-2470-101 / O / C / O | ||
MSSP-6340-001 / C / O / C | ||
PHIL-1380-301 / C / O / C | ||
PSCI-0200-202 / O / C / O | ||
SOCI-2910-404 / C / O / C | ||
SOCI-2931-401 / C / O / C | ||
SWRK-6020-005 / O / C / O |
Review comment: Is this status per section or per course?
Reply: It should be section_status; will change.
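Since both review threads ask what each status column means, here is a small sketch that names the four columns of a report row and reproduces the two groupings used in the report. The AuditRow type, its field names, and the helper functions are hypothetical; the sample rows are copied from the report above.

```python
from typing import NamedTuple


class AuditRow(NamedTuple):
    section_code: str
    last_update_status: str  # new_status of the most recent status update on record
    stored_status: str       # status currently stored on the section in our database
    registrar_status: str    # status reported by the registrar (Path@Penn), the "actual" one


def parse_row(line):
    """Parse one 'CODE / X / Y / Z' report line into an AuditRow."""
    return AuditRow(*(part.strip() for part in line.split("/")))


def split_by_last_update(rows):
    """Separate rows whose stored status agrees with the last recorded update from those that do not."""
    matches = [r for r in rows if r.last_update_status == r.stored_status]
    mismatches = [r for r in rows if r.last_update_status != r.stored_status]
    return matches, mismatches


rows = [
    parse_row("ANTH-2550-401 / C / C / O"),
    parse_row("CIS-1210-214 / C / O / C"),
]
matches, mismatches = split_by_last_update(rows)
```

Every row in either group is per section, and in every row the stored status differs from the registrar status, which is why the row was reported at all.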