Rule 2ee8b8 ("Visible label is part of accessible name"): introducing a new "label in name algorithm". #2075
base: develop
…l in name algorithm". It's intended mostly to handle whitespace and punctuation.
@dan-tripp-siteimprove Since this is still being worked on by @kengdoj, can we set this to draft?
Done
This looks good. I like the details and the many new examples that make explicit the decisions we've taken.
pages/glossary/visible-inner-text.md (Outdated)
The <dfn id="for-text">visible inner text of a [text node][]</dfn> is:
- if the [text node][] is [visible][], its visible inner text is its [data][];
- if the [text node][] is not-[visible][], [rendered][], and contains only [whitespace][], its visible inner text is the string `" "` (a single ASCII whitespace);
The conditional here sounds a bit weird 🤔
Notably, a text node that is not visible, rendered, and contains more than whitespace (e.g. in `<span style="visibility: hidden">Hello</span>`) would not trigger it and would therefore have an empty string as visible inner text (rather than a whitespace).
Interesting question. I don't know the answer. But I'll note that I copied this definition from sanshikan so if it needs fixing here, it probably needs fixing there too.
OK, doing some archaeology, this is due to the fact that whitespace is not visible per our definition…
<button aria-label="hello world"><span>hello</span><span id="space"> </span><span>world</span></button>
The `span#space` is not visible (and neither is its child text node), so the first bullet doesn't apply. Without the second bullet, the visible inner text of the button would be `helloworld`, not matching the accessible name of `hello world` due to spacing…
I guess we need to add an example to show that.
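To make the effect of the second bullet concrete, here is a minimal Python sketch of the two bullets applied to the button above. The `TextNode` class, the `visible` and `rendered` flags, and the idea of simply concatenating child text nodes are assumptions made for illustration; they are not taken from the glossary. Python's `str.strip()` is also only an approximation of the [whitespace][] definition.

```python
from dataclasses import dataclass


@dataclass
class TextNode:
    """Hypothetical stand-in for a DOM text node with precomputed flags."""
    data: str
    visible: bool   # assumed to follow the rules' own "visible" definition
    rendered: bool  # assumed to follow the rules' own "rendered" definition


def visible_inner_text(node, keep_whitespace_rule=True):
    # First bullet: a visible text node contributes its data.
    if node.visible:
        return node.data
    # Second bullet: not visible, rendered, whitespace-only -> a single space.
    whitespace_only = node.data != "" and node.data.strip() == ""
    if keep_whitespace_rule and node.rendered and whitespace_only:
        return " "
    # Otherwise the node contributes the empty string.
    return ""


def joined(nodes, keep_whitespace_rule=True):
    # Assumption: the element's text is the concatenation of its text nodes.
    return "".join(visible_inner_text(n, keep_whitespace_rule) for n in nodes)


button = [
    TextNode("hello", visible=True, rendered=True),
    TextNode(" ", visible=False, rendered=True),  # child of span#space
    TextNode("world", visible=True, rendered=True),
]

print(repr(joined(button)))                              # with the second bullet
print(repr(joined(button, keep_whitespace_rule=False)))  # without it
```

With the second bullet the result keeps the space between the two words; without it the words run together, which is the mismatch described above.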
Done in b2df021
This raises another question: what should we do with this?
<a aria-label="Download specification" href="#"><span>Download</span><span style="visibility: hidden">x</span><span>specification</span></a>
According to the current definition, because of the clause "contains only [whitespace][]", the visible inner text of the <a> element is "Downloadspecification". Visually it looks like "Download specification". So I wonder if we could remove the clause "contains only [whitespace][]". What do you think?
Good point 🤔 But if the `span` was invisible due to absolute positioning out of the viewport, it should be removed:
<a aria-label="Download specification" href="#"><span>Download</span><span style="position: absolute; left: -9999px">x</span><span>specification</span></a>
I guess the true condition is whether it creates a CSS box that lies somewhere between the ones of the rest of the text taking part in the computation (and isn't fully contained in them), or something like that 🙈
Or maybe we just make a special case for `visibility: hidden` and assume that this is already a corner case that won't create too many real problems. (We've been using that definition in Alfa for two years and I don't remember seeing a problem caused by it, so it may be safe to assume that it is a good enough approximation.)
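As a rough illustration of the two options discussed here, the following Python sketch compares the current definition with one possible reading of the "special case for `visibility: hidden`" idea. The `TextNode` class, the `visibility_hidden` flag and the `visibility_hidden_variant` function are hypothetical names invented for this sketch; nothing here is an agreed definition.

```python
from dataclasses import dataclass


@dataclass
class TextNode:
    """Hypothetical stand-in for a DOM text node with precomputed flags."""
    data: str
    visible: bool
    rendered: bool
    visibility_hidden: bool = False  # hypothetical flag for the special case


def current_rule(node):
    # The two bullets of the definition under review.
    if node.visible:
        return node.data
    if node.rendered and node.data != "" and node.data.strip() == "":
        return " "
    return ""


def visibility_hidden_variant(node):
    # Hypothetical variant: text hidden by visibility: hidden still occupies
    # space in the layout, so it is treated like a single space.
    if not node.visible and node.rendered and node.visibility_hidden:
        return " "
    return current_rule(node)


def joined(nodes, rule):
    return "".join(rule(n) for n in nodes)


hidden_link = [
    TextNode("Download", True, True),
    TextNode("x", False, True, visibility_hidden=True),
    TextNode("specification", True, True),
]
offscreen_link = [
    TextNode("Download", True, True),
    TextNode("x", False, True),  # position: absolute; left: -9999px
    TextNode("specification", True, True),
]

for link in (hidden_link, offscreen_link):
    print(repr(joined(link, current_rule)),
          repr(joined(link, visibility_hidden_variant)))
```

Under this sketch the current rule gives `Downloadspecification` for both links, while the variant only inserts a space for the `visibility: hidden` case and still drops the off-screen text.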
This has given me a lot to think about. I'll try to bring it up in our next one-on-one meeting.
…://github.com/Siteimprove/sanshikan/blob/main/terms/visible-inner-text.md) - changing glossary links' prefixes from "./" to "#". I don't know if the former was working or not, but the latter is the common practice, it seems.
Co-authored-by: Jean-Yves Moyen <[email protected]>
…placing it with a new idea: the algorithm 'return value', e.g. 'returns "is contained"'. - rewording rule expectation. I think that 'For the target element' is better than 'For each target element' because for this rule, the computation of the expectation for each applicable target element is done in isolation from the other applicable targets on the page. It's simpler if the "for loop" over all applicable targets is done by the tool, not the rule.
…s algorithm is for.
Thanks, Dan.
Those are good examples. I was wondering whether the visible text from a label element is included in this rule, for example:
<label for="abc">Recording </label><button id="abc"> </button>
<label>Submit<input type="submit" /> </label>
This can get a little more complicated if multiple items are involved, for example: <label for="abc">Recording </label><button id="abc" aria-label="recording the document">Do it</button>
Please just ignore this if the case is covered by another rule.
Regards,
Shunguo Yan, Ph.D.
AI, Mobile, Web, Accessibility Technology & Innovation
From: Dan Tripp ***@***.***>
Date: Thursday, January 23, 2025 at 5:27 PM
To: act-rules/act-rules.github.io ***@***.***>
Cc: Shunguo Yan ***@***.***>, Mention ***@***.***>
Subject: [EXTERNAL] Re: [act-rules/act-rules.github.io] Rule 2ee8b8 ("Visible label is part of accessible name"): introducing a new "label in name algorithm". (PR #2075)
[@shunguoy](https://github.com/shunguoy) in the examples for this PR, there are buttons with inner text in [passed example 4](https://github.com/act-rules/act-rules.github.io/pull/2075/files#diff-b609e901c9aca49d8a0ada57a17d4d3905a41f9c1b6030e8f03dc40db4d9dec8L90-L96), [passed example 7](https://github.com/act-rules/act-rules.github.io/pull/2075/files#diff-b609e901c9aca49d8a0ada57a17d4d3905a41f9c1b6030e8f03dc40db4d9dec8R130-R136), [failed example 10](https://github.com/act-rules/act-rules.github.io/pull/2075/files#diff-b609e901c9aca49d8a0ada57a17d4d3905a41f9c1b6030e8f03dc40db4d9dec8R300-R306) and [failed example 11](https://github.com/act-rules/act-rules.github.io/pull/2075/files#diff-b609e901c9aca49d8a0ada57a17d4d3905a41f9c1b6030e8f03dc40db4d9dec8R308-R314) - are these the kind of examples you're interested in?
Thank you for clarifying. I hadn't thought of <label> cases before, and I don't know of any <label> cases - as they relate to the 'label in name' SC - being covered by other ACT rules. This rule is interested in two pieces of information: the visible label and the accessible name. Your cases seem to raise questions about the visible label, i.e. whether this rule should consider <label> element contents to be part of the visible label. Your cases seem not to raise any questions about the accessible name. For the accessible name, we at ACT use the accessible name computation algorithm, which is defined by another group of people, and which seems to work fine. The complications we have seem to all be regarding the "visible label" part of this. Am I right? The 'label in name' SC mentions <label> only briefly, and not in a way that helps us, as far as I can tell. If anything, it suggests that we need to be concerned with <label> cases, and much more: "'label' is not used in such a programmatic sense but is simply referring to a text string in close visual proximity to a component". This is a worthwhile issue, but probably not something that this PR can fix. If you would like to collaborate with me on a future PR, let me know.
@shunguoy This is a good point indeed. I think the rule tried to deflect the problem by restricting its Applicability to roles that accept name from content (thus, e.g., dodging The rule actually requires all of (i) name from content; (ii) (non-empty) visible inner text; and (iii) As @dan-tripp-siteimprove says, this is unrelated to the current PR, but definitely worth discussing, can you open an issue so we don't forget about it? |
will do. Thanks. |
While I think the comment for improving upon this work (concerns around parentheses meaning different things within different contexts) is valid and should be addressed, I agree that it may be best to take it up as an improvement.
Aside from this, I believe the number of examples to be appropriate given the complexity of the rule, and everything else seems correct to my understanding. Approving!
__tests__/spelling-ignore.yml
Outdated
@@ -135,7 +135,7 @@ | |||
- ozplayer | |||
- GitHub | |||
|
|||
# Test case anamolies | |||
# Test case anomolies |
# Test case anomolies | |
# Test case anomalies |
|
||
This rule assumes that the visible label doesn't use CSS to add whitespace where none exists in the DOM. | ||
|
||
This rule assumes that for any word which appears in both the accessible name and the visible label, the same spelling and hyphenation is used in both places. For example: if "non-negative" is used in the accessible name and "nonnegative" is used in the visible label, that would violate this assumption. Or if "color" is used in the accessible name and "colour" is used in the visible label, that would also violate this assumption. |
This rule assumes that for any word which appears in both the accessible name and the visible label, the same spelling and hyphenation is used in both places. For example: if "non-negative" is used in the accessible name and "nonnegative" is used in the visible label, that would violate this assumption. Or if "color" is used in the accessible name and "colour" is used in the visible label, that would also violate this assumption. | |
This rule assumes that for any word which appears in both the accessible name and the visible label, the same spelling and hyphenation is used in both places. For example, if "non-negative" is used in the accessible name and "nonnegative" is used in the visible label, that would violate this assumption. Similarly, if "color" is used in the accessible name and "colour" is used in the visible label, that would also violate this assumption. |
- For each character that either a) represents non-text content, or b) isn't a letter or a digit: replace that character with a space character. | ||
- For a) Judgment of "non-text" probably can't be fully automated. For example: "X" for "close" probably can be automated, but presumably there are more cases than this. | ||
- For b) Use the Unicode classes Letter, Mark, and "Number, Decimal Digit [Nd]". (This will exclude hyphens, punctuation, emoji, and more.) | ||
- Remove all characters that are within parentheses (AKA round brackets). |
Given that we know this is an issue, I think it should be included somehow as a note that it may result in false negatives in these situations, particularly since we are already recognising i18n differences in the whitespace-splitting part of this algorithm.
- It checks whether elements are consecutive or not. That is: it checks for a substring, in the computer science sense of the term. Not a subsequence. | ||
- An empty list is a sublist of any list. | ||
|
||
If the answer is "yes" (that is: the tokenized 'label' is a sublist of the tokenized 'name'), then this algorithm returns "is contained". Otherwise, it returns "is not contained". |
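The excerpts above only show parts of the algorithm, so the following Python sketch is a reading of those parts under stated assumptions rather than the rule's definition. It assumes that parenthesised text is removed before other characters are replaced, that parentheses are not nested, and that tokens are the whitespace-separated pieces. It skips step a) (non-text content, which the excerpt says probably can't be fully automated) and any case handling, which the excerpt does not show. The function names are made up for this sketch.

```python
import re
import unicodedata


def tokenize(text):
    # Assumption: drop parenthesised runs first (non-nested only).
    text = re.sub(r"\([^()]*\)", "", text)
    # Step b): keep Letter (L*), Mark (M*) and Decimal Digit (Nd) characters;
    # replace every other character with a space.
    kept = "".join(
        ch
        if unicodedata.category(ch)[0] in "LM" or unicodedata.category(ch) == "Nd"
        else " "
        for ch in text
    )
    # Assumption: tokens are the whitespace-separated pieces.
    return kept.split()


def is_sublist(needle, haystack):
    # Consecutive elements, same order; an empty list is a sublist of any list.
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


def label_in_name(label, name):
    if is_sublist(tokenize(label), tokenize(name)):
        return "is contained"
    return "is not contained"


print(tokenize("Save (draft), now!"))
print(label_in_name("next page", "Go to next page"))
print(label_in_name("page next", "Go to next page"))
```

Because case handling is left out, `"Next"` and `"next"` are different tokens in this sketch.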
Daft question time. What if the label is 'This is a cat' and the name is 'This is not a cat'. The tokenized name would include the tokenized label but is obviously entirely incorrect.
I would be happy to be missing something in the specific definitions of 'is contained' and 'sublist'.
@iadawn when you wrote "these situations", which situation do you mean? Is it (a) (Judgment of "non-text")? If so: I think that would result in a false positive (i.e. incorrect failing of the rule), not a false negative.
As for the 'This is a cat' case: I tried to cover that with the bullet point that starts with "It checks whether elements are consecutive or not."
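The difference between the two readings can be sketched in a few lines of Python. The helper names are hypothetical, and the strings are split on whitespace only, for brevity.

```python
def is_sublist(needle, haystack):
    # Consecutive elements in the same order (a "substring" of tokens).
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


def is_subsequence(needle, haystack):
    # Same order, but gaps are allowed between matched elements.
    it = iter(haystack)
    return all(token in it for token in needle)


label = "This is a cat".split()
name = "This is not a cat".split()

print(is_sublist(label, name))      # consecutive check
print(is_subsequence(label, name))  # gaps allowed
```

Under the consecutive check the label is not contained in the name, because "not" interrupts the run; only the subsequence reading would accept it.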
'These situations' relates to the situation where removal of parentheses results in a change in meaning for certain languages. You note that these will result in a false negative, but it might be worth being clear about that possibility in the text somehow.
Regarding the cat: on rereading it, I think that the problem may start with the statement:
> is the tokenized 'label' a sublist of the tokenized 'name'
That introduces the term 'sublist' - which has its own particular meaning - and then defines that term in a different way to its own particular meaning.
Perhaps another way to write this would be:
If the answer is "yes" (that is: the tokenized 'label' is a sublist of the tokenized 'name'), then this algorithm returns "is contained". Otherwise, it returns "is not contained". | |
Then check if the consecutive tokenized 'label' is contained within the tokenized 'name'. |
Or some such.
> 'These situations' relates to the situation where removal of parentheses results in a change in meaning for certain languages.
@iadawn I see now. Thank you for clarifying. I just pushed two commits for it. Does that cover it?
> That introduces the term 'sublist' - which has its own particular meaning - and then defines that term in a different way to its own particular meaning.
You replaced "sublist" with "contained within". With all respect, I think that's a step backwards. "Contained within" is ambiguous, as was revealed by #1458, which this PR (in its current state, i.e. using my definition of "sublist") fixes.
> That introduces the term 'sublist' - which has its own particular meaning - and then defines that term in a different way to its own particular meaning.
I'm afraid I don't understand either part of that sentence. 1) Assuming that by "particular meaning" you mean "official definition": I couldn't find any official definition of "sublist", so I made my own. 2) Even if "sublist" had an official definition somewhere, I assume that official definition would define it to work the same as every programming language's "subList()" function, which means: consecutive elements, same order. Which is the same way that my definition works. So how my definition differs from that is beyond me.
I've resolved conflicts with the 'develop' branch. Some of them were tricky. I would appreciate it if you could take a look to see whether I messed something up here.
22ede5a to 4da5300
|
||
Let 'label' be the [visible inner text][] of the target element. Let 'name' be the [accessible name][] of the target element. Both 'label' and 'name' are strings. | ||
|
||
Sub-algorithm to tokenize a string: |
Since these steps are sequential, suggest they be numbered instead of bullet points.
Nice, thank you. I hadn't thought to do that. I will take a look - but it might take me a few days.
@daniel-montalvo I looked today and I found no problems. Thank you.
|
||
This rule assumes that the visible label doesn't use CSS to add whitespace where none exists in the DOM. | ||
|
||
This rule assumes that for any word which appears in both the accessible name and the visible label, the same spelling and hyphenation is used in both places. For example: if "non-negative" is used in the accessible name and "nonnegative" is used in the visible label, that would violate this assumption. Similarly, if "color" is used in the accessible name and "colour" is used in the visible label, that would also violate this assumption. |
I believe the presence or absence of a hyphen is exempt under the 'punctuation optional' wording in the "Punctuation and capitalization" section of the Understanding document. Speech recognition users would rarely say "non hyphen negative" or "non dash negative". Most would just say "nonnegative"; my personal experience with speech recognition is that it would generate a match in either case.
I agree - and I think that's why it's appropriate here, under "Assumptions". ACT assumptions is the home for things that, if broken, can result in a case that fails the rule and passes the SC, which is what this is, am I right?
I'm saying I think your assumption is too contained in this case. It's relatively easy to ignore hyphens, so I think a better assumption would be that hyphens, as punctuation, are ignored. I agree that misspellings are more problematic. Short of a dictionary of allowed spelling variances, I think that your starting assumption for spelling is supportable.
|
||
This rule assumes that for any word which appears in both the accessible name and the visible label, the same spelling and hyphenation is used in both places. For example: if "non-negative" is used in the accessible name and "nonnegative" is used in the visible label, that would violate this assumption. Similarly, if "color" is used in the accessible name and "colour" is used in the visible label, that would also violate this assumption. | ||
|
||
This rule - specifically, the [label in name algorithm][] that this rule relies on - assumes that the algorithm's treatment of parentheses is appropriate in the given human language. "Parentheses" are also known as "round brackets". The algorithm's treatment of parentheses is to remove them and all characters within them. This assumption can be reworded as: content within parentheses can be ignored. This assumption is almost always true in English. It is known to be often false in other languages, such as German (where parentheses indicate dual states) and Arabic (where parentheses are often used as quotation marks). Violations of this assumption will, in real-world scenarios, more often result in a false negative for this rule rather than a false positive. |
However, if all the information in parentheses is stripped out of the test, wouldn't a resulting match for the remaining string still be valid, regardless of language?
i.e., If I strip out the parenthetical material in "This is my label (XXXX)" from both the label and name strings, shouldn't the resulting name still contain the resulting string of the label?
That sounds correct to me. Does that cause any problems?
I would suggest the algorithm has a first step for text normalization before any comparison.
> That sounds correct to me. Does that cause any problems?
I agree that for variable considerations like Label in Name, it makes sense to bias tests towards false negatives (that is, to err on the side of flagging fewer potential issues).
``` | ||
|
||
#### Failed Example 5 | ||
|
||
This link has [visible][] text does not match the [accessible name][] because there are extra spaces in the accessible name. | ||
This link has an [accessible name][] which contains a hyphen. The [label in name algorithm][] breaks up words on hyphens. So it turns "non-standard" into two tokens: "non" and "standard". So this fails the rule. |
> The [label in name algorithm][] breaks up words on hyphens
That seems somewhat problematic, especially when considered from a speech input perspective, where aurally there is no meaningful difference between "non standard", "non-standard", and "nonstandard".
I agree, and it's unclear to me what the ideal fix for this would be, but I think this is a problem which - as I've said for other problems - "this PR didn't create, and didn't make worse" - am I right? If so, then I suggest - as I did with similar problems raised on this PR - that we deal with it later. In the spirit of incremental improvement.
I think the algorithm does make it worse by breaking a hyphenated word into two words. If the algorithm simply removed the hyphen (ignored it) rather than replacing it with a space, it would raise fewer false positives. I'm not saying it would be ideal, but it would be less problematic.
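To compare the two hyphen treatments side by side, here is a small Python sketch. The `hyphen_policy` parameter is hypothetical and exists only for this comparison, only the ASCII hyphen-minus is handled, and "nonstandard" as the other spelling is an assumed example rather than the text of Failed Example 5.

```python
import unicodedata


def is_word_char(ch):
    # Letter (L*), Mark (M*) or Decimal Digit (Nd), as in the tokenizing step.
    cat = unicodedata.category(ch)
    return cat[0] in "LM" or cat == "Nd"


def tokenize(text, hyphen_policy="space"):
    # "space": hyphens become spaces like any other punctuation.
    # "remove": hypothetical alternative that deletes hyphens first.
    if hyphen_policy == "remove":
        text = text.replace("-", "")
    return "".join(ch if is_word_char(ch) else " " for ch in text).split()


for policy in ("space", "remove"):
    print(policy, tokenize("non-standard", policy), tokenize("nonstandard", policy))
```

With the "space" policy the two spellings tokenize differently, while with "remove" they agree. The "remove" policy would in turn separate "non-standard" from "non standard", which fits the remark that it is not ideal either.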
|
||
#### Failed Example 7 | ||
|
||
The rule has no special handling for abbreviations. |
> The rule has no special handling for abbreviations.
This seems problematic. For many abbreviations it is much more natural (in English) to say the full word when encountering an abbreviation (for example, abbreviations in an address like "Rd.", "St.", "Pl.", etc).
Perhaps it is just another shortcoming of this SC, but the calculation seems to me to need to be more fuzzy, and bias towards passing, not failing an abbreviation construct.
Again, I agree, but I consider this problem to be in the category of problems that this PR didn't create and didn't make worse. What do you think?
I guess two points.
The SC wording is: "the name contains the text that is presented visually."
We already have a rule stating that punctuation can be disregarded, so in this example, the displayed wording can be interpreted as "University Ave" and the name is "University Avenue". The accessible name literally contains the text that is presented visually (plus three more characters). So why is it a fail? It should be a pass, no?
The second point is just about abbreviations as synonyms. I can think of situations in English where it would be unusual to say only the abbreviation. For instance, if someone sees "Main St." or "First Rd.", they are going to say the abbreviation as the full word ("main street", "first road"). On the other hand "Ave" may be spoken as the full word or as the abbreviation. Part of me believes it is reasonable to compile a set of abbreviations which are commonly spoken as the full word, and not the abbreviation. But that seems unsustainable/unscalable.
On consideration, I reluctantly agree that failing a mismatch where the name does not contain all the characters of the displayed abbreviation (less punctuation) makes sense.
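The tension in the first point comes down to comparing characters versus comparing whole tokens. A short Python sketch, using the two strings from this comment and a hypothetical `is_sublist` helper, shows the two outcomes:

```python
def is_sublist(needle, haystack):
    # Consecutive elements in the same order.
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


label = "University Ave"    # displayed text with the trailing "." disregarded
name = "University Avenue"  # accessible name

char_level = label in name                               # character containment
token_level = is_sublist(label.split(), name.split())    # whole-token containment

print(char_level, token_level)
```

Character containment accepts the pair because "Ave" is a prefix of "Avenue", while whole-token containment rejects it because "Ave" and "Avenue" are different tokens.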
Co-authored-by: Mike Gower <[email protected]>
I believe the computation of the label does not take into account cases where there is additional, clearly separated information following the label text that may be embedded in the same element, but should probably not be used when evaluating whether name contains the label string. Example:
<label>First name (required)</label><input type="text" aria-label="First name">
This would formally fail, but it probably shouldn't.
The broader problem is that the purpose of the SC (to allow predictable and workable speech input) is practically undone in cases where longish label text is included in the accessible name in order to meet 2.5.3. When using voice control on mobile phones, speaking the long label text to activate the element will then often not work. So, a formal PASS, but a practical FAIL.
Not sure how to include this in the algorithm though.
I'd like to thank everyone for the comments in recent weeks. Unfortunately I have been too occupied with other work to respond to them. I hope to have time in late May.
Need for Call for Review:
This will require a 2 weeks Call for Review