
CC-2305: Management commands for healing Video file S3<->Postgres relationships #180


Open: wants to merge 238 commits into base: master


@jonholdsworth jonholdsworth commented Sep 2, 2024

Adds management commands to resolve discrepancies between NZSL Signbank's Postgres database and Amazon S3 file storage.

Jira ticket

CC-2305 NZSL Signbank: UAT database backport

Changes

  • Three management commands created, details below.
  • README documentation created.

These management commands are used to report on relationships between Signbank's Postgres database and Amazon's S3 file storage, and then assist with effecting some types of repair where discrepancies exist. See documentation.

These commands only perform certain actions. Other operations have to be carried out manually using the AWS CLI or other means, using the output from these commands as data. See documentation.

  • The commands use the Boto3 python library to talk to AWS S3.
  • They use an external client to talk to Postgres.
  • They output diagnostic and progress information on STDERR. All data output is on STDOUT and may be safely redirected.
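
The STDOUT/STDERR split described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the function name `write_report` and the column names are hypothetical.

```python
import csv
import sys


def write_report(rows, data_out=None, diag_out=None):
    """Write CSV data to data_out and progress messages to diag_out.

    Defaults to STDOUT for data and STDERR for diagnostics, so that
    redirecting STDOUT to a file captures only the CSV.
    """
    data_out = sys.stdout if data_out is None else data_out
    diag_out = sys.stderr if diag_out is None else diag_out

    # Hypothetical column names, chosen only for this sketch.
    fieldnames = ["video_key", "in_database", "in_s3"]
    writer = csv.DictWriter(data_out, fieldnames=fieldnames)
    writer.writeheader()

    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1

    # Diagnostics never touch the data stream.
    print(f"Wrote {count} rows", file=diag_out)
    return count
```

With this arrangement, a shell redirect such as `> report.csv` captures only the CSV, while progress messages still appear on the terminal.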

The commands require, usually in the environment:

  • An AWS profile - eg. AWS_PROFILE environment variable set to a pre-configured profile.
  • A Postgres context - eg. DATABASE_URL environment variable with target and credentials.
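
A pre-flight check for these two settings might look like the sketch below. It assumes both values are supplied through the environment variables named above. The helper `missing_settings` is hypothetical and is not part of the commands.

```python
import os

# The two variables named in the description above.
REQUIRED_ENV_VARS = ("AWS_PROFILE", "DATABASE_URL")


def missing_settings(environ=None):
    """Return the names of required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not environ.get(name)]
```

A caller could print the returned names to STDERR and exit early rather than failing partway through a run.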

The commands have some common arguments:

  • --help or -h - emit a Help message showing the available arguments.
  • --env - specifies the target environment, eg. dev, uat, production. This is used to construct the name of the AWS S3 bucket, eg. nzsl-signbank-media-uat. The default is uat.
  • --pgcli - allows the user to specify a different path for the Postgres command-line client. The default is /usr/bin/psql.
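
The common arguments can be illustrated with plain `argparse`, which supplies `--help`/`-h` automatically. This is a sketch only: the help strings are invented, and the bucket naming pattern is inferred from the single example `nzsl-signbank-media-uat`, so treat it as an assumption.

```python
import argparse


def build_parser():
    """Build a parser with the common arguments described above."""
    parser = argparse.ArgumentParser(
        description="Sketch of the shared command-line arguments."
    )
    parser.add_argument(
        "--env",
        default="uat",
        help="Target environment, eg. dev, uat, production (default: uat)",
    )
    parser.add_argument(
        "--pgcli",
        default="/usr/bin/psql",
        help="Path to the Postgres command-line client",
    )
    return parser


def bucket_name(env):
    """Assumed naming pattern, inferred from 'nzsl-signbank-media-uat'."""
    return f"nzsl-signbank-media-{env}"
```

In a Django management command the same `add_argument` calls would normally live in the command's `add_arguments(self, parser)` method.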

Commands


  • get-video-s3-acls

This command produces a full report comparing NZSL Signbank database records with S3 objects.
It outputs as CSV, with headers.


  • find-fixable-s3-orphans

This command accesses the database and S3 in a similar way to get-video-s3-acls.py.

It finds S3 objects that have no corresponding NZSL Signbank database record. These are 'orphaned' S3 objects.
It then parses the name string of the object and attempts to find an NZSL Signbank record that matches it. This is not guaranteed to be correct, so the output needs human review.
It outputs what it finds as CSV with a header, in a format that can be digested by the third command, repair-fixable-s3-orphans.py.
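
The orphan search and name matching can be sketched with plain sets and a regular expression. The real key layout is not stated in this description, so the pattern below (`<gloss id>-<anything>`) is purely hypothetical, as are all function and column names.

```python
import re

# Hypothetical key layout: the file name starts with "<gloss id>-".
# The real naming scheme may differ.
KEY_PATTERN = re.compile(r"^(?P<gloss_id>\d+)-")


def find_orphans(s3_keys, db_keys):
    """Return S3 keys that have no corresponding database record."""
    return sorted(set(s3_keys) - set(db_keys))


def guess_gloss_id(key):
    """Parse a candidate gloss id from the object name, or return None."""
    name = key.rsplit("/", 1)[-1]
    match = KEY_PATTERN.match(name)
    return int(match.group("gloss_id")) if match else None


def fixable_orphans(s3_keys, db_keys, known_gloss_ids):
    """Pair each orphan with a gloss id, keeping only ids that exist."""
    rows = []
    for key in find_orphans(s3_keys, db_keys):
        gloss_id = guess_gloss_id(key)
        if gloss_id is not None and gloss_id in known_gloss_ids:
            rows.append({"gloss_id": gloss_id, "video_key": key})
    return rows
```

Because the match relies only on the object name, each returned row is a candidate rather than a confirmed pairing, which is why the output needs human review.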


  • repair-fixable-s3-orphans

This attempts to unify NZSL Signbank records with S3 orphans by digesting CSV input in the same format as that output by find-fixable-s3-orphans.py. It does this by generating GlossVideo Django objects where necessary and associating them with the correct Gloss Django objects. This operation changes the database contents and so must be used with caution.
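
The "where necessary" step can be sketched without Django as a planning pass over the CSV. The column names `gloss_id` and `video_key` and the helper `plan_repairs` are hypothetical. The actual command works with GlossVideo and Gloss objects, which this sketch does not touch.

```python
import csv
import io


def plan_repairs(csv_text, existing_video_keys):
    """Return (gloss_id, video_key) pairs that still need a video record.

    Rows whose key already has a record are skipped, so running the
    plan twice over the same input proposes nothing new the second time.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    planned = []
    for row in reader:
        if row["video_key"] in existing_video_keys:
            continue
        planned.append((int(row["gloss_id"]), row["video_key"]))
    return planned
```

Printing the plan for review before writing anything fits the caution this operation calls for.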


@jonholdsworth

I have made all 3 scripts into Django Management Commands.
They have improved help text as well.

@jonholdsworth

This was never merged!

@jonholdsworth jonholdsworth marked this pull request as ready for review March 12, 2025 04:35
@jonholdsworth

@G-Rath I believe this can be merged.
This adds the scripts as management commands.
Could you review, please?

The github ignores could possibly be removed, or can be addressed separately.

@jonholdsworth jonholdsworth changed the title CC-2305: create script for auditing S3 files and their permissions CC-2305: Mangement commands for healing Video file S3<->Postgres relationships Apr 16, 2025
@jonholdsworth jonholdsworth changed the title CC-2305: Mangement commands for healing Video file S3<->Postgres relationships CC-2305: Management commands for healing Video file S3<->Postgres relationships Apr 16, 2025

3 participants