Add workflow basis to fetch HEBIS Data RPB-225 #105

Merged
merged 23 commits into from
Apr 8, 2025

Conversation

TobiasNx
Contributor

@TobiasNx TobiasNx commented Dec 2, 2024

Reuse lobid transformation for Hebis Mapping

  • Reused almost all of the lobid transformation. The unused parts are overhead, but keeping them makes it easier to adjust.
  • I put all maps (static and test) in the maps folder.

TODO:

  • Adjust to HEBIS- and RPB-specific requirements.
  • Simple holdings are provided in field 924; they are now included.
  • Check: Do they provide RPB-specific information via SRU? It seems not.
  • Introduce a bash script that creates the SRU queries and reuse them in the Flux.
  • This PR depends on #129.
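
As a rough illustration of the SRU query step in the list above, here is a minimal sketch in Python rather than bash. It only assembles a searchRetrieve URL from standard SRU 1.1 parameters. The base URL, the CQL query, the record schema value, and the names `SRU_BASE_URL` and `build_sru_url` are all hypothetical; the real HEBIS endpoint and query are not given in this thread.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the real HEBIS SRU endpoint is not given in this thread.
SRU_BASE_URL = "https://sru.example.org/sru"


def build_sru_url(cql_query, start_record=1, maximum_records=10,
                  record_schema="marcxml", base_url=SRU_BASE_URL):
    """Build a searchRetrieve URL from standard SRU 1.1 parameters."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": cql_query,                  # CQL query, percent-encoded by urlencode
        "startRecord": start_record,         # 1-based offset, useful for paging
        "maximumRecords": maximum_records,
        "recordSchema": record_schema,       # assumed value; depends on the server
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # The CQL query below is made up for illustration.
    print(build_sru_url('title="Rheinland-Pfalz"'))
```

A script like this could write one URL per page of results, which the Flux could then read as its input; whether that matches the intended setup is an assumption.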

@TobiasNx
Contributor Author

TobiasNx commented Jan 17, 2025

@fsteeg and @acka47, please have a look at my transformed HEBIS data, which now includes holdings from field 924. This could be your basis for the Strapi import.

@TobiasNx TobiasNx requested a review from acka47 February 26, 2025 10:16
@TobiasNx TobiasNx changed the title from "Add first workflow to fetch HEBIS Data RPB-225" to "Add workflow basis to fetch HEBIS Data RPB-225" Feb 26, 2025
@TobiasNx TobiasNx marked this pull request as ready for review February 26, 2025 10:16
Contributor Author

@acka47, please have a look at the files named conf/output/test-hebis-to-lobid-output-n.json.

Contributor Author

@acka47, the files named conf/hebisMarc2lobid-transformation/comparisonRpbRecords/rpb-hebis-records-n.json are only for comparison; they are records from the production RPB.

@acka47 acka47 assigned acka47 and unassigned fsteeg Feb 26, 2025
@acka47
Contributor

acka47 commented Feb 28, 2025

I tested the files against the JSON schema from https://github.com/hbz/lobid-resources/blob/master/src/test/resources/schemas/resource.json. Overall, this looks good, but there are apparently two subjects in the test files that do not get sufficient information in the resulting JSON. (I will add additional comments as pointers.) Here is the validation output:

$ sh validateJsonTestFiles.sh 
Testing version: draft
alma-fix/test-hebis-to-lobid-output-0.json passed test
alma-fix/test-hebis-to-lobid-output-1.json passed test
alma-fix/test-hebis-to-lobid-output-10.json failed test
[
  {
    instancePath: '/subject/2',
    schemaPath: 'complexSubject.json/required',
    keyword: 'required',
    params: { missingProperty: 'label' },
    message: "must have required property 'label'"
  },
  {
    instancePath: '/subject/2',
    schemaPath: 'complexSubject.json/required',
    keyword: 'required',
    params: { missingProperty: 'componentList' },
    message: "must have required property 'componentList'"
  },
  {
    instancePath: '/subject/2/type/0',
    schemaPath: 'complexSubject.json/properties/type/items/const',
    keyword: 'const',
    params: { allowedValue: 'ComplexSubject' },
    message: 'must be equal to constant'
  },
  {
    instancePath: '/subject/2',
    schemaPath: 'skosConcept.json/required',
    keyword: 'required',
    params: { missingProperty: 'source' },
    message: "must have required property 'source'"
  },
  {
    instancePath: '/subject/2',
    schemaPath: '#/properties/subject/items/anyOf',
    keyword: 'anyOf',
    params: {},
    message: 'must match a schema in anyOf'
  }
]
alma-fix/test-hebis-to-lobid-output-11.json failed test
[
  {
    instancePath: '/subject/2',
    schemaPath: 'complexSubject.json/required',
    keyword: 'required',
    params: { missingProperty: 'label' },
    message: "must have required property 'label'"
  },
  {
    instancePath: '/subject/2',
    schemaPath: 'complexSubject.json/required',
    keyword: 'required',
    params: { missingProperty: 'componentList' },
    message: "must have required property 'componentList'"
  },
  {
    instancePath: '/subject/2/type/0',
    schemaPath: 'complexSubject.json/properties/type/items/const',
    keyword: 'const',
    params: { allowedValue: 'ComplexSubject' },
    message: 'must be equal to constant'
  },
  {
    instancePath: '/subject/2',
    schemaPath: 'skosConcept.json/required',
    keyword: 'required',
    params: { missingProperty: 'source' },
    message: "must have required property 'source'"
  },
  {
    instancePath: '/subject/2',
    schemaPath: '#/properties/subject/items/anyOf',
    keyword: 'anyOf',
    params: {},
    message: 'must match a schema in anyOf'
  }
]
alma-fix/test-hebis-to-lobid-output-12.json passed test
alma-fix/test-hebis-to-lobid-output-13.json passed test
error:  /home/acka47/git/lobid-resources/src/test/resources/alma-fix/test-hebis-to-lobid-output-14.json: Unexpected end of JSON input
-e Test FAILED

We might have to add a mapping there or to ignore these subjects.
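
To make the "ignore these subjects" option concrete, here is a minimal sketch that drops subject entries lacking the properties named in the error messages above. The rules are inferred only from those messages (a `ComplexSubject` needs `label` and `componentList`, any other subject needs `source`), not from the actual schema, and the function names are hypothetical. In the real workflow such a filter would presumably live in the transformation itself.

```python
def is_sufficient_subject(subject):
    """Check a subject against rules inferred from the validator messages.

    Assumption: these rules mirror the error output only, not the real schema.
    """
    if "ComplexSubject" in subject.get("type", []):
        return "label" in subject and "componentList" in subject
    return "source" in subject


def drop_sparse_subjects(record):
    """Return a copy of the record without subjects that are too sparse."""
    result = dict(record)
    if "subject" in record:
        result["subject"] = [s for s in record["subject"] if is_sufficient_subject(s)]
    return result
```

The function returns a shallow copy, so the input record keeps its original subject list for comparison.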

However, it would probably make sense to set up an automatic test that fetches the schema from the lobid-resources repo and validates the files against it, wouldn't it?
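
One possible shape for such a test, as a hedged sketch: fetch the schema, parse every output file, and collect errors per file. `SCHEMA_URL` is the assumed raw-file counterpart of the schema link above and is not verified here; `fetch_schema`, `check_files`, and the `validate` callable are hypothetical names. The actual schema check is left to the caller, for example a JSON Schema library, which would also have to resolve the referenced sub-schemas such as complexSubject.json.

```python
import json
from pathlib import Path
from urllib.request import urlopen

# Assumed raw-file counterpart of the schema URL linked above; not verified here.
SCHEMA_URL = ("https://raw.githubusercontent.com/hbz/lobid-resources/"
              "master/src/test/resources/schemas/resource.json")


def fetch_schema(url=SCHEMA_URL):
    """Download and parse the JSON schema (requires network access)."""
    with urlopen(url) as response:
        return json.load(response)


def check_files(paths, validate):
    """Map each file path to a list of error strings.

    validate(document) must return an iterable of error strings; how it checks
    the document against the schema is up to the caller.
    """
    results = {}
    for path in paths:
        try:
            document = json.loads(Path(path).read_text(encoding="utf-8"))
        except json.JSONDecodeError as error:
            # Covers truncated files like the "Unexpected end of JSON input" case.
            results[str(path)] = [f"invalid JSON: {error.msg}"]
            continue
        results[str(path)] = list(validate(document))
    return results
```

A CI job could call `check_files` on the generated output files and fail the build if any list of errors is non-empty.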

"label" : "Systematik der DNB (bis 2003)",
"id" : "https://bartoc.org/en/node/18497"
}
}, {
Contributor

This subject entry is too sparse with respect to the lobid-resources JSON schema, see #105 (comment)

"label" : "Systematik der DNB (bis 2003)",
"id" : "https://bartoc.org/en/node/18497"
}
}, {
Contributor

This subject entry is also too sparse with respect to the lobid-resources JSON schema, see #105 (comment)

Contributor Author

Same situation as above.

@acka47 acka47 assigned TobiasNx and unassigned acka47 Feb 28, 2025
@TobiasNx TobiasNx force-pushed the rpb-225-fetchAndTransformHebis branch from 702074f to f73b85b Compare February 28, 2025 13:14
Comment on lines +23 to +26
| batch-reset(batchsize="1")
| encode-json(prettyPrinting="true")
| write(FLUX_DIR + "output/test-hebis-to-lobid-output-${i}.json")
;
Contributor Author

@fsteeg, for the production workflow you would need to adjust the process here.

"type" : [ "Item", "PhysicalObject" ],
"callNumber" : "Mog m 7188"
}, {
"heldBy" : {
Member

The structure here seems to be wrong: heldBy is in a separate object, but should be in the object above?

Contributor Author

Fixed this.
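
The structure issue raised in this thread can be illustrated with a small sketch: an object that carries only `heldBy` is folded into the item before it. This only shows the intended shape of the data; the fix in this PR was presumably made in the transformation itself, and `merge_detached_held_by` and the example organisation URL are hypothetical.

```python
def merge_detached_held_by(items):
    """Fold objects that only carry 'heldBy' into the preceding item."""
    merged = []
    for item in items:
        detached = set(item) == {"heldBy"}
        if detached and merged and "heldBy" not in merged[-1]:
            # Attach the holding institution to the item it belongs to.
            merged[-1] = {**merged[-1], "heldBy": item["heldBy"]}
        else:
            merged.append(dict(item))
    return merged


if __name__ == "__main__":
    items = [
        {"type": ["Item", "PhysicalObject"], "callNumber": "Mog m 7188"},
        {"heldBy": {"id": "https://example.org/organisation/1"}},  # made-up id
    ]
    print(merge_detached_held_by(items))
```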

To merge external hebis data with strapi record for indexing
@fsteeg
Member

fsteeg commented Apr 1, 2025

Great, this works in principle, see comment in Jira.

Assigning:

@fsteeg fsteeg assigned dr0i and TobiasNx and unassigned fsteeg Apr 1, 2025
@TobiasNx TobiasNx requested a review from fsteeg April 1, 2025 15:53
@dr0i dr0i removed their assignment Apr 3, 2025
@dr0i
Member

dr0i commented Apr 3, 2025

I added a comment in 4a8c98f, but cannot see it ... has GitHub changed its behaviour? Can you see my comment?

@fsteeg
Member

fsteeg commented Apr 3, 2025

> I added a comment in 4a8c98f, but cannot see it ... has GitHub changed its behaviour? Can you see my comment?

I don't see it. Maybe you added it to a review, but didn't submit the review?

@fsteeg
Member

fsteeg commented Apr 3, 2025

> I don't see it. Maybe you added it to a review, but didn't submit the review?

Found the comment and replied here:

4a8c98f#r154800490

@fsteeg fsteeg requested review from dr0i and removed request for acka47 April 3, 2025 09:12
@fsteeg
Member

fsteeg commented Apr 3, 2025

Now under functional review by LBZ in https://jira.hbz-nrw.de/browse/RPB-225.

@fsteeg fsteeg unassigned dr0i and TobiasNx Apr 3, 2025
@fsteeg fsteeg requested a review from acka47 April 3, 2025 15:53
@fsteeg
Member

fsteeg commented Apr 3, 2025

Re-requested review from @acka47 to make sure requested changes are addressed.

Contributor

@acka47 acka47 left a comment

+1

@fsteeg fsteeg merged commit 9074c38 into main Apr 8, 2025
1 check passed

4 participants