Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Several fixups for unicode normalization #16584

Merged
merged 6 commits into from
May 27, 2024

Conversation

LeonarddeR
Copy link
Collaborator

@LeonarddeR LeonarddeR commented May 21, 2024

Link to issue number:

Fixup of #16521
Fixes #11570
Partial fix for #4631

Summary of the issue:

  1. It turns out that rawTextTypeforms on a region may be None, this was an oversight on my end.
  2. cursorPos may also be None.
  3. @burmancomp reported a zero division error in case a string ended with a non breaking space and a space.

Description of user facing changes

No longer errors in the log when getting flash messages in Thunderbird and/or reading messages in WhatsApp UWP.

Description of development approach

  1. Explicitly check for None typeforms and cursorPos, thereby improving readability as well.
  2. Improve the calculateOffsets method in textUtils to ensure it can handle the case as reported by @burmancomp

Testing strategy:

From a python console
braille.TextRegion("ij").update()
No longer results in an error.
Same for braille.TextRegion("\xa0 ").update()

Known issues with pull request:

None known

Code Review Checklist:

  • Documentation:
    • Change log entry
    • User Documentation
    • Developer / Technical Documentation
    • Context sensitive help for GUI changes
  • Testing:
    • Unit tests
    • System (end to end) tests
    • Manual testing
  • UX of all users considered:
    • Speech
    • Braille
    • Low Vision
    • Different web browsers
    • Localization in other languages / culture than English
  • API is compatible with existing add-ons.
  • Security precautions taken.

@LeonarddeR LeonarddeR requested a review from a team as a code owner May 21, 2024 11:09
@LeonarddeR LeonarddeR requested a review from seanbudd May 21, 2024 11:09
@LeonarddeR LeonarddeR changed the title Respect cases where no typeforms are provided Braille unicode normalization: Respect cases where no typeforms or cursor position provided May 21, 2024
@AppVeyorBot
Copy link

  • Build execution time has reached the maximum allowed time for your plan (60 minutes).

See test results for failed build of commit 8147d2e76b

@AppVeyorBot
Copy link

  • Build execution time has reached the maximum allowed time for your plan (60 minutes).

See test results for failed build of commit 740467ce53

@AppVeyorBot
Copy link

  • Build execution time has reached the maximum allowed time for your plan (60 minutes).

See test results for failed build of commit b9ed18e76b

@LeonarddeR LeonarddeR marked this pull request as draft May 22, 2024 12:29
@LeonarddeR
Copy link
Collaborator Author

@burmancomp reported a zero division error when doing:
textUtils.UnicodeNormalizationOffsetConverter("removed original text\xa0 ")

@LeonarddeR LeonarddeR changed the title Braille unicode normalization: Respect cases where no typeforms or cursor position provided Several fixups for unicode normalization May 22, 2024
@AppVeyorBot
Copy link

  • Build execution time has reached the maximum allowed time for your plan (60 minutes).

See test results for failed build of commit e687938887

@LeonarddeR LeonarddeR marked this pull request as ready for review May 25, 2024 06:49
@LeonarddeR LeonarddeR requested a review from a team as a code owner May 25, 2024 06:49
@LeonarddeR
Copy link
Collaborator Author

I think for now, we could probably best leave this as is. Note that there are still some open questions, but these can be handled in a follow up:

  1. Whether unicode normalization should be disabled by default (I think it still should),. That said, we should be able to change the default without breaking explicit configurations, hence the feature flag with combo box approach.
  2. Whether character navigation should also normalize. It currently doesn't, and I think there's a real good reason for it. An option to consider is to report the normalized character preceded by the word normalized, but in other languages (e.g. in Dutch) this will probably be very weird and confusing. Other options hook into the character descriptions mechanism.

Copy link
Member

@seanbudd seanbudd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @LeonarddeR

user_docs/en/userGuide.md Outdated Show resolved Hide resolved
user_docs/en/changes.md Outdated Show resolved Hide resolved
Copy link
Member

@Qchristensen Qchristensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@seanbudd
Copy link
Member

@LeonarddeR - are these 2 things tracked in a separate issue or discussion? I think they'd be good to discuss for 2024.4

@seanbudd seanbudd merged commit 00f5ea2 into nvaccess:master May 27, 2024
1 check was pending
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support mathematical alphanumeric symbols
4 participants