Feature request: I'd like to be able to instruct my OAI client to return harvested metadata records unparsed, as Strings #284

Open
landreev opened this issue Feb 6, 2025 · 1 comment · May be fixed by #285

Comments

@landreev
Collaborator

landreev commented Feb 6, 2025

I'm going to make a PR with a working model implementation, on the assumption that it will need to be reworked before it gets accepted.

landreev added a commit that referenced this issue Feb 6, 2025
…ucted to return harvested metadata records unparsed #284
landreev added a commit that referenced this issue Feb 6, 2025
landreev added a commit that referenced this issue Feb 7, 2025
@poikilotherm
Member

poikilotherm commented Feb 27, 2025

@landreev, I'd like to better understand what you are trying to achieve so that I can review the PR properly. Can we cook up an example that explains what you want to do?

Let's say we have this as part of a response when harvesting:

<ListRecords>
  <record>
    <header>
      [...]
    </header>
    <metadata>
      <rfc1807 xmlns="http://info.internet.isi.edu:80/in-notes/rfc/files/rfc1807.txt" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://info.internet.isi.edu:80/in-notes/rfc/files/rfc1807.txt http://www.openarchives.org/OAI/1.1/rfc1807.xsd">
        <bib-version>v2</bib-version>
        <id>hep-th/9901001</id>
      </rfc1807>
    </metadata>
    <about>
      [...]
    </about>
  </record>
</ListRecords>

Is your intent to quickly get the contents of the <metadata> element as an unparsed, raw string? With the example above, that would be:

<rfc1807 xmlns="http://info.internet.isi.edu:80/in-notes/rfc/files/rfc1807.txt" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://info.internet.isi.edu:80/in-notes/rfc/files/rfc1807.txt http://www.openarchives.org/OAI/1.1/rfc1807.xsd">
  <bib-version>v2</bib-version>
  <id>hep-th/9901001</id>
</rfc1807>

Is that correct? Assuming yes: what are you doing after this with the XML metadata? Would it help to parse it some other way or return a more specific structure than a String?
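
To make the question concrete, here is a minimal sketch of what "unparsed" could mean in practice: slicing the text between the <metadata> tags out of the response without building any object model. It is not the approach taken in the PR and not part of any existing client API. The class name `RawMetadataSketch` and the method `extractRawMetadata` are hypothetical, and the sketch assumes an unprefixed <metadata> tag, no CDATA section containing a literal `</metadata>`, and that every namespace the payload needs is declared on the payload itself (declarations on ancestor elements would be lost).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of any real OAI client API.
final class RawMetadataSketch {

    // Matches an unprefixed <metadata ...>...</metadata> element and captures its inner text.
    // DOTALL lets '.' span line breaks; the reluctant quantifier stops at the first closing tag.
    private static final Pattern METADATA =
            Pattern.compile("<metadata(?:\\s[^>]*)?>(.*?)</metadata>", Pattern.DOTALL);

    // Returns the raw inner text of every <metadata> element, with outer whitespace trimmed.
    static List<String> extractRawMetadata(String response) {
        List<String> result = new ArrayList<>();
        Matcher m = METADATA.matcher(response);
        while (m.find()) {
            result.add(m.group(1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened version of the sample response above (assumed input).
        String response = String.join("\n",
                "<ListRecords>",
                "  <record>",
                "    <metadata>",
                "      <rfc1807 xmlns=\"http://info.internet.isi.edu:80/in-notes/rfc/files/rfc1807.txt\">",
                "        <bib-version>v2</bib-version>",
                "        <id>hep-th/9901001</id>",
                "      </rfc1807>",
                "    </metadata>",
                "  </record>",
                "</ListRecords>");

        // Prints the payload exactly as it appeared in the response, minus outer whitespace.
        for (String raw : extractRawMetadata(response)) {
            System.out.println(raw);
        }
    }
}
```

The point of the sketch is the return type: the caller gets a plain String per record and decides later whether and how to parse it. A real implementation would more likely copy the payload out of the streaming parser the client already uses, which would also handle prefixed tags and inherited namespace declarations.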
