Closed as duplicate of #545
Description
The Archives resource and its attendant functionality are unclear to me on several levels:
- I do not know where its endpoint is being called from
- I do not fully understand under what conditions a data source is to be archived
- The logic for determining what is archived seems awkward, and I'm not sure why a simple flag of "archived" in each row would not suffice:
# Imports inferred from the type annotations; the quoted snippet omits them.
from typing import Any

from psycopg2.extensions import connection as PgConnection


def archives_get_results(conn: PgConnection) -> list[tuple[Any, ...]]:
    """
    Pulls data sources for the automatic archives script that performs caching

    :param conn: A psycopg2 connection object to a PostgreSQL database.
    :return: A list of tuples, one per row matching the query conditions.
    """
    cursor = conn.cursor()
    sql_query = """
    SELECT
        airtable_uid,
        source_url,
        update_frequency,
        last_cached,
        broken_source_url_as_of
    FROM
        data_sources
    WHERE
        (last_cached IS NULL OR update_frequency IS NOT NULL)
        AND broken_source_url_as_of IS NULL
        AND url_status <> 'broken'
        AND source_url IS NOT NULL
    """
    cursor.execute(sql_query)
    return cursor.fetchall()
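
To make the selection logic easier to reason about, here is a minimal sketch that runs the same WHERE clause against an in-memory SQLite table. The sample rows, the 'ok' status value, the TEXT column types, and the helper name select_archive_candidates are all hypothetical and not taken from the real schema. SQLite stands in for PostgreSQL only because it treats NULL the same way in these particular comparisons.

```python
import sqlite3

# Same WHERE clause as archives_get_results, copied from the query above.
# ORDER BY is added only to make the output deterministic.
ARCHIVE_QUERY = """
    SELECT airtable_uid
    FROM data_sources
    WHERE (last_cached IS NULL OR update_frequency IS NOT NULL)
      AND broken_source_url_as_of IS NULL
      AND url_status <> 'broken'
      AND source_url IS NOT NULL
    ORDER BY airtable_uid
"""

# Hypothetical rows, in the column order:
# (airtable_uid, source_url, update_frequency, last_cached,
#  broken_source_url_as_of, url_status)
SAMPLE_ROWS = [
    ("a_never_cached", "https://example.com/a", None, None, None, "ok"),
    ("b_cached_no_frequency", "https://example.com/b", None, "2024-01-01", None, "ok"),
    ("c_cached_with_frequency", "https://example.com/c", "Monthly", "2024-01-01", None, "ok"),
    ("d_marked_broken", "https://example.com/d", "Monthly", None, None, "broken"),
    ("e_broken_as_of", "https://example.com/e", "Monthly", None, "2024-02-01", "ok"),
    ("f_null_status", "https://example.com/f", "Monthly", None, None, None),
    ("g_no_url", None, "Monthly", None, None, "ok"),
]


def select_archive_candidates(rows):
    """Load rows into a throwaway table and return the uids the query selects."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed column types; the real table is not shown in this issue.
        conn.execute(
            "CREATE TABLE data_sources ("
            "airtable_uid TEXT, source_url TEXT, update_frequency TEXT, "
            "last_cached TEXT, broken_source_url_as_of TEXT, url_status TEXT)"
        )
        conn.executemany("INSERT INTO data_sources VALUES (?, ?, ?, ?, ?, ?)", rows)
        return [uid for (uid,) in conn.execute(ARCHIVE_QUERY)]
    finally:
        conn.close()


if __name__ == "__main__":
    print(select_archive_candidates(SAMPLE_ROWS))
```

With these rows, only a_never_cached and c_cached_with_frequency are returned. Two things stand out. First, any row with update_frequency set is selected regardless of last_cached, so the query alone does not decide whether a source is actually due for archiving; presumably the calling script makes that decision, though that is an assumption on my part. Second, a row whose url_status is NULL is excluded, because NULL <> 'broken' does not evaluate to true.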
I'd like to figure out the answer to these questions, and possibly refactor the Archives resource for improved clarity, or update the documentation.
Related to Police-Data-Accessibility-Project/automatic-archives#21