Skip to content

Consolidate 'finalize' result summaries into a single file #177

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 20, 2025

Conversation

jonavellecuerdo
Copy link
Contributor

@jonavellecuerdo jonavellecuerdo commented Jun 16, 2025

Purpose and background context

Simplify generated report via 'finalize' by keeping all DSS results for a batch in a single file

Note: DO NOT MERGE UNTIL PR #176 is approved and merged! These changes build on the work in PR #176.

How can a reviewer manually see the effects of these changes?

A test run using a small sample of OpenCourseWare files was performed. The screenshot and file attachment shared below are from the email sent on June 16, 2025 at 10:24AM EST with subject heading: DSpace Submission Results - opencourseware, batch='opencourseware-2025-04-22-stepfunctiontest'.

  1. Review screenshot of email sent via finalize command
image
  1. Review file attachment: dss_submission_results.csv

Includes new or updated dependencies?

NO

Changes expectations for external applications?

NO

What are the relevant tickets?

Developer

  • All new ENV is documented in README
  • All new ENV has been added to staging and production environments
  • All related Jira tickets are linked in commit message(s)
  • Stakeholder approval has been confirmed (or is not needed)

Code Reviewer(s)

  • The commit message is clear and follows our guidelines (not just this PR message)
  • There are appropriate tests covering any new functionality
  • The provided documentation is sufficient for understanding any new functionality introduced
  • Any manual tests have been performed or provided examples verified
  • New dependencies are appropriate or there were no changes

@coveralls
Copy link

coveralls commented Jun 16, 2025

Pull Request Test Coverage Report for Build 15786856642

Details

  • 4 of 4 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.01%) to 95.72%

Totals Coverage Status
Change from base Build 15780438034: -0.01%
Covered Lines: 738
Relevant Lines: 771

💛 - Coveralls

@jonavellecuerdo jonavellecuerdo force-pushed the IN-1323-improve-methods-for-processing-result-messages branch from dc67866 to 3781203 Compare June 20, 2025 13:37
Base automatically changed from IN-1323-improve-methods-for-processing-result-messages to main June 20, 2025 13:48
Why these changes are being introduced:
* Simplify generated report via 'finalize' by keeping
all DSS results for a batch in a single file

How this addresses that need:
* Update 'finalize' Jinja reporting templates to read all
results from Workflow.events.processed_items
* Create FinalizeReport.create_summary method to create
content of email body
* Fix Docker build and publish commands in Makefile

Side effects of this change:
* The emailed report sent via the 'finalize' command will comprise of
a single CSV file attachment, containing the information from all
parsed DSS result messages.

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/IN-1320
@jonavellecuerdo jonavellecuerdo force-pushed the IN-1320-consolidate-results-in-finalize-report branch from 9b5ac47 to 9e6705c Compare June 20, 2025 13:59
@jonavellecuerdo jonavellecuerdo marked this pull request as ready for review June 20, 2025 14:15
@jonavellecuerdo jonavellecuerdo requested a review from a team as a code owner June 20, 2025 14:15
Copy link

@ghukill ghukill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved! Consolidating seems like a no-brainer here.

One comment, totally optional, might be to reorder the columns in the report. Looking at the CSV I see the following column order:

item_identifier,result_message_body,ingested,dspace_handle,error

I think moving the more debugging-esque column result_message_body to the end could be helpful. I might propose an order like:

item_identifier,ingested,dspace_handle,error,result_message_body

But looks good otherewise.

Comment on lines 73 to 87
def to_plain_text(self) -> str:
return self.jinja_template_plain_text.render(
workflow_name=self.workflow_name,
batch_id=self.batch_id,
report_date=self.report_date,
processed_items=self.events.processed_items,
ingested_items=self.get_ingested_items(),
errors=self.events.errors,
summary=self.create_summary(),
)

def to_html(self) -> str:
return self.jinja_template_html.render(
workflow_name=self.workflow_name,
batch_id=self.batch_id,
report_date=self.report_date,
processed_items=self.events.processed_items,
ingested_items=self.get_ingested_items(),
errors=self.events.errors,
summary=self.create_summary(),
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big +1 to consolidating these.

Copy link
Contributor

@ehanson8 ehanson8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfection! But I agree with Graham's suggested column order, that would be more readable to the stakeholders

@jonavellecuerdo jonavellecuerdo merged commit 4232d65 into main Jun 20, 2025
3 checks passed
@jonavellecuerdo jonavellecuerdo deleted the IN-1320-consolidate-results-in-finalize-report branch June 20, 2025 20:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants