Clarify/Refactor Archives Logic #309

Closed as duplicate of #545
@maxachis

Description

The Archives resource and its attendant functionality are unclear to me on several levels:

  1. I do not know where its endpoint is being called from
  2. I do not fully understand under what conditions a data source is to be archived
  3. The logic for determining what is archived seems awkward, and I'm not sure why a simple flag of "archived" in each row would not suffice:
def archives_get_results(conn: PgConnection) -> list[tuple[Any, ...]]:
    """
    Pulls data sources for the automatic archives script that performs caching

    :param conn: A psycopg2 connection object to a PostgreSQL database.
    :return: A list of tuples, one per row matching the query conditions.
    """
    cursor = conn.cursor()
    sql_query = """
    SELECT
        airtable_uid,
        source_url,
        update_frequency,
        last_cached,
        broken_source_url_as_of
    FROM
        data_sources
    WHERE
        (last_cached IS NULL OR update_frequency IS NOT NULL)
        AND broken_source_url_as_of IS NULL
        AND url_status <> 'broken'
        AND source_url IS NOT NULL
    """
    cursor.execute(sql_query)

    return cursor.fetchall()
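
To make the selection conditions easier to discuss, the WHERE clause above can be restated as a plain Python predicate over a single row. This is only a reading aid, not code from the repository: the function name `is_archive_candidate`, the sample rows, the `"ok"` status value, and the example URL are all hypothetical. It assumes each row is available as a mapping keyed by column name.

```python
from typing import Any, Mapping


def is_archive_candidate(row: Mapping[str, Any]) -> bool:
    """Mirror the WHERE clause of archives_get_results for one row (illustrative only)."""
    # (last_cached IS NULL OR update_frequency IS NOT NULL)
    needs_caching = (
        row.get("last_cached") is None or row.get("update_frequency") is not None
    )
    # broken_source_url_as_of IS NULL
    no_broken_date = row.get("broken_source_url_as_of") is None
    # url_status <> 'broken': in SQL, NULL <> 'broken' evaluates to NULL,
    # so a row with a NULL url_status is filtered out as well.
    url_status = row.get("url_status")
    status_ok = url_status is not None and url_status != "broken"
    # source_url IS NOT NULL
    has_url = row.get("source_url") is not None
    return needs_caching and no_broken_date and status_ok and has_url


# Hypothetical sample rows; the values are assumptions made for illustration.
BASE_ROW = {
    "source_url": "https://example.com/data",
    "update_frequency": None,
    "last_cached": None,
    "broken_source_url_as_of": None,
    "url_status": "ok",
}
never_cached = dict(BASE_ROW)
cached_no_frequency = {**BASE_ROW, "last_cached": "2024-01-01"}
cached_with_frequency = {**cached_no_frequency, "update_frequency": "Monthly"}
null_status = {**BASE_ROW, "url_status": None}

for name, row in [
    ("never_cached", never_cached),
    ("cached_no_frequency", cached_no_frequency),
    ("cached_with_frequency", cached_with_frequency),
    ("null_status", null_status),
]:
    print(name, is_archive_candidate(row))
```

Read this way, the query returns every source that has never been cached, plus every source that has an `update_frequency` at all, regardless of how recently it was cached. A source that was cached once and has no `update_frequency` is never returned again. The query itself does not compare `last_cached` against `update_frequency`; whether the consuming script does so is not shown here.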

I'd like to figure out the answers to these questions, and then either refactor the Archives resource for improved clarity or update the documentation.

Related to Police-Data-Accessibility-Project/automatic-archives#21

Labels: documentation (Improvements or additions to documentation); post-v1 (For after sunsetting v1); refactor 🔄️ (Improve the code without changing the underlying logic)
