TIMX 410 - add TIMDEX provenance to Opensearch mapping #360
@ehanson8, @jonavellecuerdo - if interested, I've confirmed this works in Dev1 as well. Here is the document for accessing Opensearch dashboards in Dev1: https://mitlibraries.atlassian.net/wiki/spaces/D/pages/3665854480/How+to+access+OpenSearch+Dashboards+in+AWS. Once it is open and you are at the querying screen, this works:
Here is an example provenance object:
    "timdex_provenance": {
        "source": "libguides",
        "run_date": "2025-01-23",
        "run_id": "e758d6c4-6ee4-4862-a00f-b9da4d3758ad",
        "run_record_offset": 0
    }
Note the
This is beginning to fully close the loop here:
Why these changes are being introduced:
With Transmogrifier beginning to write a "timdex_provenance" section to TIMDEX records, an update is needed for TIM to include this during index creation and writing.

How this addresses that need:
Updates Opensearch mapping to include the new "timdex_provenance" field. Additionally, sample records were updated to include provenance field values, and cassettes were re-recorded for successful indexing of records with those provenance values.

Side effects of this change:
* Support for records with provenance sections

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/TIMX-406
* https://mitlibraries.atlassian.net/browse/TIMX-410
Looks good, and here's to rabbit holes only leading to minimal changes!
Purpose and background context
This PR updates the Opensearch mapping to support the new timdex_provenance field.
The mapping update itself was quite minimal, following the pattern of a nested structure of fields.
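As a rough illustration of what a nested structure of fields can look like, here is a minimal sketch of a mapping fragment for timdex_provenance, written as a Python dict. The field names come from the example provenance object in this PR; the Opensearch field types (keyword, date, integer) and the helper mismatched_fields are assumptions for illustration, not the mapping or code actually committed.

```python
from __future__ import annotations

# Hypothetical mapping fragment. Field names are taken from the example
# provenance object in this PR; the field types are assumptions.
TIMDEX_PROVENANCE_MAPPING = {
    "timdex_provenance": {
        "properties": {
            "source": {"type": "keyword"},
            "run_date": {"type": "date"},
            "run_id": {"type": "keyword"},
            "run_record_offset": {"type": "integer"},
        }
    }
}

# Python type a JSON value needs in order to plausibly fit each assumed field type.
PYTHON_TYPES = {"keyword": str, "date": str, "integer": int}


def mismatched_fields(
    provenance: dict, mapping: dict = TIMDEX_PROVENANCE_MAPPING
) -> list[str]:
    """Return names of fields that are unmapped or whose value type does not fit."""
    properties = mapping["timdex_provenance"]["properties"]
    problems = []
    for name, value in provenance.items():
        field = properties.get(name)
        if field is None or not isinstance(value, PYTHON_TYPES[field["type"]]):
            problems.append(name)
    return problems


# The example provenance object from this PR.
example = {
    "source": "libguides",
    "run_date": "2025-01-23",
    "run_id": "e758d6c4-6ee4-4862-a00f-b9da4d3758ad",
    "run_record_offset": 0,
}

print(mismatched_fields(example))
```

A check like this only compares Python value types against the assumed field types; it does not replace indexing a record into a real Opensearch instance, which is what the re-recorded cassette exercises.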
Testing was a bit of a rabbit hole that I eventually crawled back out of with minimal changes.
As noted in a previous PR, the testing suite for TIM relies heavily on VCR cassettes for recording interactions with Opensearch. Interestingly, there is relatively light per-field testing of the mapping (e.g. there are not dedicated tests for field foo where we attempt to index value bar), and perhaps that is okay.
Short of a dedicated test to see if this timdex_provenance mapping is matching the actual JSON values we expect in the transformed record, the VCR cassette for test test_create_index_success has been re-recorded with updated sample records that contain this new field. The successful creation of records suggests that the mapping is aligned with the new values in tests/fixtures/sample_records.json.
Additionally, a minor update was made to Makefile for managing a local Opensearch instance per this commit.
How can a reviewer manually see the effects of these changes?
1- Run Opensearch locally
2- Set Dev1 credentials in terminal
3- Give Opensearch 20-30 seconds to start... then create an index and bulk update from a test dataset in S3
4- Navigate to http://localhost:5601/app/dev_tools#/console and perform this query
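The exact query is not shown above. One hypothetical way to ask only for records that carry provenance is an exists query on one of the provenance subfields. The sketch below builds such a request body and prints it in the form the Dev Tools console accepts. The index name my-index and both helper functions are placeholders, and if the field were mapped with the nested type the exists clause would need to be wrapped in a nested query.

```python
import json


def provenance_exists_query(
    field: str = "timdex_provenance.run_id", size: int = 5
) -> dict:
    """Build a search body matching documents where the given field is present."""
    return {"size": size, "query": {"exists": {"field": field}}}


def as_console_request(index: str, body: dict) -> str:
    """Format a search body as a request that can be pasted into the Dev Tools console."""
    return f"GET {index}/_search\n{json.dumps(body, indent=2)}"


# "my-index" is a placeholder; use the name of the index created in step 3.
print(as_console_request("my-index", provenance_exists_query()))
```

Records indexed before Transmogrifier began writing provenance would not match this query, which makes it a quick way to confirm that the new field is actually being stored.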
Note the timdex_provenance sections in the records returned, example:
Includes new or updated dependencies?
NO
Changes expectations for external applications?
YES: Opensearch documents will now contain timdex_provenance fields if Transmogrifier has included them during transformation
What are the relevant tickets?
Developer
Code Reviewer(s)