Several fixups for unicode normalization #16584

LeonarddeR · 2024-05-21T11:09:33Z

Link to issue number:

Fixup of #16521
Fixes #11570
Partial fix for #4631

Summary of the issue:

It turns out that rawTextTypeforms on a region may be None; this was an oversight on my end.
cursorPos may also be None.
@burmancomp reported a zero division error when a string ends with a non-breaking space followed by a space.

Description of user facing changes

Errors are no longer logged when getting flash messages in Thunderbird and/or reading messages in WhatsApp UWP.

Description of development approach

Explicitly check for None typeforms and cursorPos, which also improves readability.
Improve the calculateOffsets method in textUtils so that it handles the case reported by @burmancomp.
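To illustrate why the explicit None checks matter, here is a minimal sketch. It is not the NVDA code: `remap_region_state` and its `normalized_to_raw` parameter are hypothetical names invented for this example. The point is that a cursor position of 0 and an empty typeforms list are both valid values, so a truthiness test would wrongly treat them as missing.

```python
from __future__ import annotations

from typing import Optional


def remap_region_state(
    raw_typeforms: Optional[list[int]],
    cursor_pos: Optional[int],
    normalized_to_raw: list[int],
) -> tuple[Optional[list[int]], Optional[int]]:
    """Hypothetical helper, not part of NVDA.

    normalized_to_raw[n] is assumed to hold the raw offset that
    normalized offset n originates from.
    """
    normalized_typeforms = None
    if raw_typeforms is not None:
        # Each normalized character inherits the typeform of its raw source.
        normalized_typeforms = [raw_typeforms[raw] for raw in normalized_to_raw]

    normalized_cursor = None
    if cursor_pos is not None:
        # "if cursor_pos:" would skip a cursor at offset 0, hence "is not None".
        normalized_cursor = next(
            (norm for norm, raw in enumerate(normalized_to_raw) if raw >= cursor_pos),
            len(normalized_to_raw),
        )
    return normalized_typeforms, normalized_cursor
```

For a raw string of two characters where the first expands to two normalized characters, `normalized_to_raw` would be `[0, 0, 1]`; passing `None` for either argument simply yields `None` for the corresponding result.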

Testing strategy:

From the Python console:
braille.TextRegion("ĳ").update()
no longer results in an error.
The same applies to braille.TextRegion("\xa0 ").update()
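For context on why these two strings are useful test inputs, the sketch below shows what they normalize to with the standard library. It assumes the NFKC normalization form; `nfkc` is just a local helper name for this example.

```python
import unicodedata


def nfkc(text: str) -> str:
    """Normalize text with NFKC (assumed form for this illustration)."""
    return unicodedata.normalize("NFKC", text)


# "\u0133" is the single-character ligature "ĳ"; "\xa0" is a non-breaking space.
for raw in ("\u0133", "\xa0 "):
    normalized = nfkc(raw)
    print(repr(raw), len(raw), "->", repr(normalized), len(normalized))
```

The ligature grows from one character to two ("ij"), so offsets in the raw and normalized strings diverge. The non-breaking space becomes a regular space, leaving a string of the same length that ends in two identical characters.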

Known issues with pull request:

None known

Code Review Checklist:

Documentation:
- Change log entry
- User Documentation
- Developer / Technical Documentation
- Context sensitive help for GUI changes
Testing:
- Unit tests
- System (end to end) tests
- Manual testing
UX of all users considered:
- Speech
- Braille
- Low Vision
- Different web browsers
- Localization in other languages / culture than English
API is compatible with existing add-ons.
Security precautions taken.

AppVeyorBot · 2024-05-21T13:19:46Z

Build execution time has reached the maximum allowed time for your plan (60 minutes).

See test results for failed build of commit 8147d2e76b

AppVeyorBot · 2024-05-21T14:40:56Z

Build execution time has reached the maximum allowed time for your plan (60 minutes).

See test results for failed build of commit 740467ce53

AppVeyorBot · 2024-05-22T04:54:59Z

Build execution time has reached the maximum allowed time for your plan (60 minutes).

See test results for failed build of commit b9ed18e76b

LeonarddeR · 2024-05-22T12:31:25Z

@burmancomp reported a zero division error when doing:
textUtils.UnicodeNormalizationOffsetConverter("removed original text\xa0 ")
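The thread does not show where the division by zero occurred, so the following is not the calculateOffsets implementation from textUtils. It is a simplified sketch of one way to map offsets between a raw string and its normalized form without any division, by normalizing character by character. `build_offset_maps` is a hypothetical name, and the per-character approach ignores composition across characters (for example a base letter followed by a combining accent), which whole-string normalization would handle.

```python
import unicodedata


def build_offset_maps(text: str, form: str = "NFKC") -> tuple[str, list[int], list[int]]:
    """Hypothetical helper: normalize per character and record offset maps.

    Returns (normalized, raw_to_norm, norm_to_raw).
    """
    parts: list[str] = []
    raw_to_norm: list[int] = []
    norm_to_raw: list[int] = []
    norm_offset = 0
    for raw_offset, char in enumerate(text):
        piece = unicodedata.normalize(form, char)
        # The raw character starts at the current normalized offset.
        raw_to_norm.append(norm_offset)
        # Every normalized character in this piece points back to the raw one.
        norm_to_raw.extend([raw_offset] * len(piece))
        parts.append(piece)
        norm_offset += len(piece)
    return "".join(parts), raw_to_norm, norm_to_raw


normalized, raw_to_norm, norm_to_raw = build_offset_maps("removed original text\xa0 ")
```

With the reported string, the normalized result has the same length as the input and both maps are identity mappings, which makes it a reasonable regression input for any offset calculation.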

AppVeyorBot · 2024-05-22T15:57:32Z

Build execution time has reached the maximum allowed time for your plan (60 minutes).

See test results for failed build of commit e687938887

LeonarddeR · 2024-05-25T06:53:40Z

I think for now, we could probably best leave this as is. Note that there are still some open questions, but these can be handled in a follow-up:

- Whether unicode normalization should be disabled by default (I think it still should). That said, we should be able to change the default without breaking explicit configurations, hence the feature flag with combo box approach.
- Whether character navigation should also normalize. It currently doesn't, and I think there's a good reason for that. One option to consider is to report the normalized character preceded by the word "normalized", but in other languages (e.g. Dutch) this will probably be very weird and confusing. Other options would hook into the character descriptions mechanism.
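As a rough illustration of the "preceded by the word normalized" option, here is a sketch under assumptions: `describe_character` is a hypothetical function, not an NVDA API, and NFKC is assumed as the normalization form. The `prefix` parameter stands in for whatever translated word would be used, which is exactly where the localization concern comes in.

```python
import unicodedata


def describe_character(char: str, prefix: str = "normalized") -> str:
    """Hypothetical: announce a character, flagging it when normalization changes it."""
    normalized = unicodedata.normalize("NFKC", char)
    if normalized == char:
        # Nothing changes, so report the character as-is.
        return char
    return f"{prefix} {normalized}"
```

A character that is already in normalized form is returned unchanged, so only characters such as ligatures would get the extra word.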

seanbudd

Thanks @LeonarddeR

user_docs/en/userGuide.md

user_docs/en/changes.md

Qchristensen

Looks good

seanbudd · 2024-05-27T00:31:15Z

@LeonarddeR - are these 2 things tracked in a separate issue or discussion? I think they'd be good to discuss for 2024.4

LeonarddeR requested a review from a team as a code owner May 21, 2024 11:09

LeonarddeR requested a review from seanbudd May 21, 2024 11:09

LeonarddeR changed the title ~~Respect cases where no typeforms are provided~~ Braille unicode normalization: Respect cases where no typeforms or cursor position provided May 21, 2024

seanbudd approved these changes May 22, 2024

LeonarddeR marked this pull request as draft May 22, 2024 12:29

LeonarddeR mentioned this pull request May 22, 2024

Add Unicode Normalization to speech and braille #16521

Merged

5 tasks

LeonarddeR changed the title ~~Braille unicode normalization: Respect cases where no typeforms or cursor position provided~~ Several fixups for unicode normalization May 22, 2024

LeonarddeR mentioned this pull request May 23, 2024

Add optional unicode normalization before passing strings to speech or braille #16466

Closed

LeonarddeR and others added 5 commits May 23, 2024 14:25

Respect cases where no typeforms are provided.

da5babf

Also fix cursor pos

3b0a871

Fix zero division error

650aa29

Mention nvaccess#11570 in changelog

14c3d83

Mention equations

5f3f0f8

LeonarddeR force-pushed the fixupNormalizationBraille branch from 3185531 to 5f3f0f8 Compare May 23, 2024 12:30

LeonarddeR marked this pull request as ready for review May 25, 2024 06:49

LeonarddeR requested a review from a team as a code owner May 25, 2024 06:49

LeonarddeR requested a review from Qchristensen May 25, 2024 06:49

seanbudd approved these changes May 26, 2024

user_docs/en/userGuide.md (outdated, resolved)

user_docs/en/changes.md (outdated, resolved)

Apply suggestions from code review

cf58f86

Qchristensen approved these changes May 27, 2024

seanbudd merged commit 00f5ea2 into nvaccess:master May 27, 2024
1 check was pending

seanbudd mentioned this pull request May 29, 2024

Normalization of unicode cahracter: allow excluding the symbols in the symbols.dic file from the normalization #16624

Closed
