Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parameter_override_option: handle numeric strings #950

Open
bertsky opened this issue Nov 10, 2022 · 2 comments
Open

parameter_override_option: handle numeric strings #950

bertsky opened this issue Nov 10, 2022 · 2 comments
Assignees

Comments

@bertsky
Copy link
Collaborator

bertsky commented Nov 10, 2022

Suppose I want to have:

        "textequiv-index": {
          "type": "string",
          "enum": ["first", "last", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"],
          "default": "first",
          "description": "Only extract lines with the specified TextEquiv/@index entries; 'first' and 'last' denote the first and last TextEquiv elements, regardless of their @index, respectively."
        }

Okay, let's run with ocrd-segment-extract-lines -P textequiv-index 0 then.

This is what happens:

Exception: Invalid parameters ["[textequiv-index] 0 is not of type 'string'", "[textequiv-index] 0 is not one of ['first', 'last', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9']"]

This is why:

def set_json_key_value_overrides(obj, *kvpairs):
for kv in kvpairs:
k, v = kv
try:
obj[k] = json.loads(v)
except json.decoder.JSONDecodeError:
obj[k] = v
return obj

Apparently, the parser decodes the 0 as number, simply because it can. It does not bother to look into the tool json's type.

Is that really what we want?

(Workaround on the user side, of course, is: -P textequiv-index '"0"', but this is ugly and surprising. Surprising, because I can write -P textequiv-index first without the quotes.)

@kba
Copy link
Member

kba commented Nov 14, 2022

Is that really what we want?

It's a tradeoff, it simplifies string values (no need for quoting) but does require quoting for strings that might be parsed as numbers or booleans.

-P textequiv-index '"0"'

Is obviously not ideal, to say the least.

simply because it can. It does not bother to look into the tool json's type.

If the type is string, then we don't need to (and should not) try to parse with JSON. That is probably the best solution here, right?

@bertsky
Copy link
Collaborator Author

bertsky commented Nov 23, 2022

If the type is string, then we don't need to (and should not) try to parse with JSON. That is probably the best solution here, right?

Yes, I think so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants