Failed With Unprocessable Entity #39

Open
tobiasschweizer opened this issue Mar 22, 2024 · 8 comments

tobiasschweizer commented Mar 22, 2024

Hi @miku,

I am experiencing a problem with an endpoint that used to work. I am not sure whether the problem lies with the endpoint itself. Maybe you can give me some guidance.

$ metha-sync -v
0.3.0
$ metha-sync -base-dir . -format marcxml -set user-lory_phlu https://zenodo.org/oai2d
INFO[0000] https://zenodo.org/oai2d?verb=Identify       
INFO[0000] harvest: &{BaseURL:https://zenodo.org/oai2d Format:marcxml Set:user-lory_phlu From: Until: Client:0xc0000847f0 MaxRequests:1048576 DisableSelectiveHarvesting:false CleanBeforeDecode:true IgnoreHTTPErrors:false MaxEmptyResponses:10 SuppressFormatParameter:false HourlyInterval:false DailyInterval:false ExtraHeaders:map[] KeepTemporaryFiles:false Delay:0 Identify:0xc00013aec0 Started:0001-01-01 00:00:00 +0000 UTC Mutex:{state:0 sema:0}} 
INFO[0000] https://zenodo.org/oai2d?from=2014-02-03T00:00:00Z&metadataPrefix=marcxml&set=user-lory_phlu&until=2014-02-28T23:59:59Z&verb=ListRecords 
FATA[0000] failed with Unprocessable Entity on https://zenodo.org/oai2d?from=2014-02-03T00:00:00Z&metadataPrefix=marcxml&set=user-lory_phlu&until=2014-02-28T23:59:59Z&verb=ListRecords: <nil> 

Among the returned info from the endpoint, I see "DailyInterval:false". Does this explain the issue?

UPDATE: I see "DailyInterval:false" also with other endpoints that work.

Thanks!

Tobias

@tobiasschweizer
Author

Using -no-intervals works.

@miku
Owner

miku commented Mar 27, 2024

Interesting. It seems that your example query does not match any record, and the server chooses to return an HTTP/1.1 422 UNPROCESSABLE ENTITY for that. I added a workaround for the 422 error (although I haven't seen this failure much in other endpoints yet).
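
For readers wondering what such a workaround could look like in general, here is a minimal Python sketch. It is not metha's actual Go code; the names `EMPTY_WINDOW_STATUS`, `is_empty_window`, and `fetch_window` are hypothetical, and the choice to treat 422 as "this window is empty" is an assumption based on the behaviour described above.

```python
from typing import Optional
from urllib.error import HTTPError
from urllib.request import urlopen

# Assumption: HTTP statuses a harvester may read as "no records in this window".
EMPTY_WINDOW_STATUS = {422}


def is_empty_window(status: int) -> bool:
    """Return True if the status should be treated as an empty result."""
    return status in EMPTY_WINDOW_STATUS


def fetch_window(url: str, opener=urlopen) -> Optional[bytes]:
    """Fetch one ListRecords window; return None if the window is considered empty."""
    try:
        with opener(url) as resp:
            return resp.read()
    except HTTPError as err:
        if is_empty_window(err.code):
            return None  # skip this window and move on to the next one
        raise  # every other HTTP error stays fatal
```

A caller would loop over the date windows, skip those for which `fetch_window` returns `None`, and keep failing hard on anything else, so genuine server errors are not silently swallowed.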

@tobiasschweizer
Author

Yes, it is strange. It used to work some months back, which is why I wrote an email to Zenodo to find out whether they changed any settings. It is weird that they indicate an Earliest Datestamp of 2014-02-03T14:41:33Z and then nothing is returned from that interval.

Thanks for the fix. I'll try it out and get back to you.
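
To make the interval question concrete: the first request in the log runs from 2014-02-03T00:00:00Z until 2014-02-28T23:59:59Z, which contains the reported earliest datestamp. The Python sketch below reproduces that window under the assumption that the harvest is split into calendar-month windows starting at midnight of the earliest datestamp's day. The function name `monthly_windows` and the `now` value are made up for illustration, and this is not taken from metha's source.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%dT%H:%M:%SZ"


def monthly_windows(earliest: datetime, now: datetime):
    """Yield (from, until) pairs, one per calendar month, covering earliest..now."""
    # Assumption: the first window starts at midnight of the earliest datestamp's day.
    start = earliest.replace(hour=0, minute=0, second=0, microsecond=0)
    while start <= now:
        # First instant of the following month.
        if start.month == 12:
            next_month = start.replace(year=start.year + 1, month=1, day=1)
        else:
            next_month = start.replace(month=start.month + 1, day=1)
        # A window ends one second before the next month, or at `now` if earlier.
        until = min(next_month - timedelta(seconds=1), now)
        yield start, until
        start = next_month


earliest = datetime(2014, 2, 3, 14, 41, 33, tzinfo=timezone.utc)
now = datetime(2014, 4, 10, 12, 0, 0, tzinfo=timezone.utc)  # arbitrary example value
for start, until in monthly_windows(earliest, now):
    print(start.strftime(FMT), until.strftime(FMT))
```

The first pair printed is `2014-02-03T00:00:00Z 2014-02-28T23:59:59Z`, the same window as in the failing request, so a record stamped 2014-02-03T14:41:33Z would be expected to fall inside it.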

@tobiasschweizer
Author

I can confirm that with 0.3.2 I am able to harvest without the -no-intervals flag. Thanks for fixing that so quickly!
In case I get an answer from Zenodo explaining the behaviour, I'll let you know.

@tobiasschweizer
Author

tobiasschweizer commented Mar 27, 2024

I have just heard from Zenodo. They logged the issue and will report back with an update once they have fixed it.

@miku
Owner

miku commented Mar 27, 2024

Thanks for the update. Somehow, HTTP 422 does not seem so wrong, after all:

The HyperText Transfer Protocol (HTTP) 422 Unprocessable Content response status code indicates that the server understands the content type of the request entity, and the syntax of the request entity is correct, but it was unable to process the contained instructions. -- https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/422

The OAI spec is not against using additional HTTP status codes: http://www.openarchives.org/OAI/openarchivesprotocol.html#StatusCodes

OAI-PMH repositories MAY employ HTTP Status-Codes in addition to "200 OK".

If I find other endpoints using HTTP 422, I'll make a note.
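
For comparison, the OAI-PMH spec also defines a `noRecordsMatch` error condition that a repository can return in an ordinary "200 OK" response when a ListRecords query matches nothing. A harvester that wants to handle both styles could check for either signal. The Python sketch below assumes that approach; the function names are hypothetical, and the sample XML is hand-written for illustration, not an actual Zenodo response.

```python
import xml.etree.ElementTree as ET
from typing import List

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written illustration of an OAI-PMH error response (not real server output).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2024-03-27T00:00:00Z</responseDate>
  <request verb="ListRecords">https://example.org/oai</request>
  <error code="noRecordsMatch">illustrative message</error>
</OAI-PMH>"""


def oai_error_codes(xml_text: str) -> List[str]:
    """Return the code attribute of every top-level <error> element in the response."""
    root = ET.fromstring(xml_text)
    return [e.get("code", "") for e in root.findall("oai:error", OAI_NS)]


def window_is_empty(status: int, body: str) -> bool:
    """Treat HTTP 422 or an OAI noRecordsMatch error as 'nothing in this window'."""
    if status == 422:  # assumption: behaviour seen in this thread
        return True
    return status == 200 and "noRecordsMatch" in oai_error_codes(body)


print(window_is_empty(200, SAMPLE))
print(window_is_empty(422, ""))
```

Both calls report an empty window, while a 200 response without a `noRecordsMatch` error would not.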

@tobiasschweizer
Author

Still, I don't understand why the earliest timestamp would not return any data.

@miku
Owner

miku commented Mar 27, 2024

I have seen the missing earliest timestamp issue in other endpoints as well, and that should be fixable.
