Rationale for perceive and perceptor names #23

Open
ruv opened this issue Dec 9, 2024 · 13 comments
Labels
naming Choosing names, the art

Comments

@ruv
Collaborator

ruv commented Dec 9, 2024

On terminology and naming

Obviously, new terms (denotations) allow us to express ourselves more concisely, since we replace long phrases with single words. Also, simple terms allow us to introduce compound terms. This applies to both programming and speech.

Obviously, it is always better to create a new compound term from two simple terms rather than from a compound term and a simple term. So, introducing a simple term or simple name for a Forth word allows us to have shorter compound names based on that name.

For example, the name recognize-forth is a compound name. When we use this name in other compound names, such as recognize-forth-sure, recognize-forth-nt, recognize-forth-nt-sure, we get long names with many hyphens.

If we use the simple name perceive instead of the compound name recognize-forth, we will have shorter compound names: perceive-sure, perceive-nt, perceive-nt-sure.

This also allows us to distinguish between general recognizers and the recognizer that the Forth text interpreter currently uses to recognize lexemes (i.e., the perceptor).
For example, we might have two completely different groups of recognizers:

  • recognize-nt ( sd.lexeme -- nt tt-nt | 0 ) — find a word according to the search order;
  • recognize-nt-sure ( sd.lexeme -- nt tt-nt ) — find a word according to the search order, throw the error -13 if it is not found;
  • perceive-nt ( sd.lexeme -- nt tt-nt | 0 ) — execute the perceptor, discard its result if the result type is different from ( nt tt-nt ).
  • perceive-nt-sure ( sd.lexeme -- nt tt-nt ) — execute the perceptor, if its result type is 0, throw the error -13, otherwise, if its result type is different from ( nt tt-nt ), throw the error -32 (invalid name argument).

The important thing is that what and how these recognizers recognize is obvious from their names and stack diagrams.
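
To make the first, second, and last entries of the list concrete, here is a minimal sketch, assuming a system that provides find-name (for example Gforth). The token tt-nt, the deferred word perceive, and its default binding are assumptions made for illustration only; none of them is defined in this thread or by a standard.

```forth
\ Hypothetical type token; its address just serves as a unique non-zero tag.
create tt-nt

\ Find a word according to the search order.
: recognize-nt ( sd.lexeme -- nt tt-nt | 0 )
  find-name dup if tt-nt then ;

\ Same, but throw -13 (undefined word) if nothing is found.
: recognize-nt-sure ( sd.lexeme -- nt tt-nt )
  recognize-nt dup if exit then -13 throw ;

\ Stand-in for "execute the perceptor"; assumed to return i*x tt | 0.
defer perceive ( sd.lexeme -- i*x tt | 0 )
' recognize-nt is perceive

\ Throw -13 on no result, -32 on a result of another type.
: perceive-nt-sure ( sd.lexeme -- nt tt-nt )
  perceive dup 0= if -13 throw then
  dup tt-nt <> if -32 throw then ;
```

perceive-nt is left out of the sketch: discarding a result of another type requires knowing how many cells that result occupies, which depends on how result types are represented.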


(from the recognizer chat, on 2024-10-08)

@ruv added the naming label (Choosing names, the art) on Dec 9, 2024
@alextangent

There are lots of new and potentially confusing words here. Lexeme, perceptor, perceive, ... I believe I have complained about these before.

The first issue is one of existing meaning in English. Find and search are not synonyms, but Forth treats them as such; I find that mildly annoying, once avoidable but now a matter of history. I'll let it be. But the proposed words are more than mildly annoying, and are very avoidable (and not yet a matter of history).

A lexeme is the set of all forms that have the same meaning, while lemma refers to the particular form that is chosen by convention to represent the lexeme. In English, for example, run, runs and running are forms of the same lexeme, but run is the lemma. I do not like the use of lexeme at all, since its natural meaning and its intended meaning are not the same.

Perceptor is a Latin word, and your use of it was the first time I had seen it. In fact, most dictionaries don't admit of the word, and while it's obviously never true that programming languages can only use words you might find in a dictionary, this one is notable for its complete obscurity. I'm only partly serious, but if it's concision and understandability you want, how about KEN? Any Scot or northern English reader will ken that word immediately, it's in the root of similar words in Germanic languages, and it's in the dictionary too.

Perceive is not a synonym of recognize. Yes, they are similar, but they are not the same. I perceive people in a photograph, but I may fail to recognize them.

While I tend to agree with the issue of compound words (especially long compound words) I do think that there are occasions where they are acceptable. But again, I can't agree that adding the postfix -sure to anything is desirable. Sure is an adverb or adjective and what "perceive-nt-sure" (a verb-noun-adverb combination) does isn't obvious from the name.

I can see what you are trying to achieve here. I agree, the choice of words is important. But I really would warn against using obscure or unfamiliar words, or those that have a well defined meaning that doesn't match what you want the reader to understand.

@ruv
Collaborator Author

ruv commented Dec 11, 2024

@alextangent, thank you for your feedback! Could you please suggest some naming options for the mentioned words?

@ruv
Collaborator Author

ruv commented Dec 11, 2024

Some Wiktionary references:

I see that the verb «to ken» is close in meaning to «to perceive». I would prefer «to perceive» over «to ken» because the former has a counterpart noun «perceptor».

@ruv
Collaborator Author

ruv commented Dec 11, 2024

@alextangent wrote:

Find and search are not synonyms, but Forth treats them as such;

I think they are used correctly in Forth terminology.

Definitions/meaning:

  • «to find»
  • «to search»
    • an example: I searched the garden for the keys and found them in the vegetable patch.

In Forth: The system searched the compilation wordlist for the word name and found a word with that name.

The names search-wordlist, find-name-in, find-name follow the usage patterns of these English words.


Sure is an adverb or adjective and what "perceive-nt-sure" (a verb-noun-adverb combination) does isn't obvious from the name.

Agreed, this should be explained.

It is a naming convention for deriving a word from a basic word: the output of the basic word usually has to be analyzed with if, whereas the output of the derivative word is ready to use without analysis.

For example, a word do-xxx ( -- mytype | 0 ) returns either a value of the type mytype, or zero. So, you have to analyze its output. A derivative word do-xxx-sure ( -- mytype ) returns a value of mytype, or throws an exception. So, you do not have to analyze its output for zero.

A more practical example:

synonym parse-lexeme parse-name
: parse-lexeme-sure ( "lexeme" -- sd.lexeme ) parse-lexeme dup if exit then -16 throw ;
: find-name-sure ( sd.lexeme -- nt ) find-name dup if exit then -13 throw ;
: parse-xt ( "lexeme" -- xt ) parse-lexeme-sure find-name-sure name> ;
synonym ' parse-xt
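
As a usage sketch of this convention, the following shows how a caller might handle the exception thrown by a "-sure" word with catch. It assumes a system that provides find-name (for example Gforth); try-find is a hypothetical name used only for illustration.

```forth
\ Repeated from the example above so that the sketch is self-contained.
: find-name-sure ( sd.lexeme -- nt ) find-name dup if exit then -13 throw ;

\ try-find is a hypothetical helper, not part of any proposal.
: try-find ( sd.lexeme -- )
  ['] find-name-sure catch   \ ( nt 0 | x x ior )
  if
    2drop ." not found"      \ after a throw, two undefined cells remain
  else
    drop ." found"           \ on success, only the nt remains
  then cr ;

s" dup" try-find
s" no-such-word-here" try-find
```

The caller that wants to handle the failure pays for it with catch, while callers that cannot continue without the word use the "-sure" variant directly and skip the zero check.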

@alextangent

@alextangent wrote:

Find and search are not synonyms, but Forth treats them as such;

I think they are used correctly in Forth terminology.

Definitions/meaning:

  • «to find»

  • «to search»

    • an example: I searched the garden for the keys and found them in the vegetable patch.

In Forth: The system searched the compilation wordlist for the word name and found a word with that name.

The names search-wordlist, find-name-in, find-name follow the usage patterns of these English words.

They don't follow correct usage with the exception of search-wordlist. Find presumes the existence of the thing; search does not presume the existence of the thing (which may be found, but may not exist). But as I said, an academic point as it's a matter of history and too late to change.

Sure is an adverb or adjective and what "perceive-nt-sure" (a verb-noun-adverb combination) does isn't obvious from the name.

Agreed, this should be explained.

Then your manifesto of clarity has failed in this case. It should at least be xxx-ENSURE (as ensure is the verb form, then it is a compound verb or verb phrase). FIND-NAME-ENSURE would be better as FIND-NAME-ELSE-THROW as what we're doing is conditionally ensuring it's found or THROWing.

I'm sorry, but I dislike the approach. I can't think of a single good reason for using or formalizing this kind of compound formation, or for requiring the definition of convenience words like this. In your example, it's not as though there's carnal knowledge required to define such a word yourself; it's just whatever DUP 0= IF -13 THROW THEN.

@alextangent

I may be described, probably fairly, as a bit of a grammar and spelling nazi (at least, I've been called such on social media). I'm also pretty forthright and can appear rude and dismissive in text. Some here in the Forth community who have met me in person would probably describe me that way in the flesh too. My apologies in advance.

Some Wiktionary references:

In reference to perceptor, I don't trust a dictionary that provides no prior use. The Oxford dictionary compilers have many millions of references for the million or so words in their magnificent dictionary. A perceptor is not in this dictionary. The only reliable references I can find online are for Latin and Spanish.

A lexeme is a collection of words, but you use it in the sense of one word. Forth has long abused the word parsing, but that too is now a matter of history. Let's not repeat that mistake; but equally, let's not introduce terms like lexing in place of parsing when it is not needed. I don't believe it adds any clarity at all.

I see that the verb «to ken» is close in meaning to «to perceive». I would prefer «to perceive» over «to ken»

I wasn't being that serious. I do like its brevity however; there aren't that many 3 letter words left to us that have suitable meanings. I prefer "recognize".

because the former has a counterpart noun «perceptor».

No it doesn't. See above.

@alextangent

@alextangent, thank you for your feedback! Could you please suggest some naming options for the mentioned words?

My apologies, I just saw this today.

I haven't found the original RECOGNIZE objectionable apart from abbreviation and length.

The meaning of recognize is quite clear and doesn't require any other exposition.

The abbreviation REC is in many ways already an accepted abbreviation of record, so shortening it to use as a prefix REC-xxx is a little ambiguous. But Forth doesn't have the concept of a record or a word remotely resembling it, so I'm OK with having a Forth specific context for it. If we ever need a RECORD, then we can sweat the details when discussing it.

As to length, REC- would seem to be an ideal prefix length at a reasonable 4, and we have names at up to 31 characters as standard if RECOGNIZE- at 10 characters is used as a prefix. Inconvenient but not insurmountable.

I vote "recognizer", RECOGNIZE, REC- and so on.

@ruv
Collaborator Author

ruv commented Dec 18, 2024

@alextangent wrote:

Find presumes the existence of the thing; search does not presume the existence of the thing (which may be found, but may not exist).

The term "existence" without its definition is too vague. If a word is not found, does it exist or not? 😉
Appealing to "existence" is too weak an argument.

But as I said, an academic point as it's a matter of history and too late to change.

Just out of curiosity, how would you change that?


Then your manifesto of clarity has failed in this case.

Let's make it more precise: the meaning is clear if the reader is familiar with the corresponding naming convention.

It should at least be xxx-ENSURE (as ensure is the verb form, then it is a compound verb or verb phrase). FIND-NAME-ENSURE would be better as FIND-NAME-ELSE-THROW as what we're doing is conditionally ensuring it's found or THROWing.

It seems that "verb-noun-verb" is no better (if not worse) than "verb-noun-adverb" as a verb phrase. For example: "take it down".
In addition, "sure" is shorter than "ensure".

FIND-NAME-ELSE-THROW is too long without any necessity. One additional English word should be enough.

Note: the Forth words ending with "-sure" were examples, I don't suggest them in the Recognizer API.

@ruv
Collaborator Author

ruv commented Dec 18, 2024

@alextangent wrote:

A lexeme is a collection of words, but you use it in the sense of one word.

It's not me who introduced this term in computer science. You know, the same English word can have different meanings in different contexts and in different fields of knowledge. The English word "lexeme" has different meanings in linguistics and in computer science / compiler construction.

Forth has long abused the word parsing, but that too is now a matter of history.

Agreed. What Forth calls "parsing" is called "scanning" or "lexical analysis" in compiler theory.

Let's not repeat that mistake;

The term "lexeme" is used in compiler theory. What term do you suggest instead? ))

@ruv
Collaborator Author

ruv commented Dec 18, 2024

@alextangent wrote:

Could you please suggest some naming options for the mentioned words?

I haven't found the original RECOGNIZE objectionable apart from abbreviation and length.

I believe, we need a term that denotes the recognizer that is used by the Forth text interpreter to translate a lexeme.

I suggested the term "perceptor" for that:

  • perceptor: the recognizer that is used by the Forth text interpreter to translate lexemes.

A counterpart term is:

  • perceive: to recognize a lexeme using the perceptor.

Could you please suggest other terms for that?

The first term should be a simple English noun suitable for naming the corresponding Forth word.
The second term should be a simple English verb.

By the way, the word "perceptor" is found in English books more often than "Forth language" 🙃

@alextangent

@alextangent wrote:

Find presumes the existence of the thing; search does not presume the existence of the thing (which may be found, but may not exist).

The term "existence" without its definition is too vague. If a word is not found, does it exist or not? 😉 Appealing to "existence" is too weak an argument.

Sheesh.

But as I said, an academic point as it's a matter of history and too late to change.

Just out of curiosity, how would you change that?

I wouldn't.

Note: the Forth words ending with "-sure" were examples, I don't suggest them in the Recognizer API.

Thank goodness.

@alextangent

@alextangent wrote:

Could you please suggest some naming options for the mentioned words?

I haven't found the original RECOGNIZE objectionable apart from abbreviation and length.

I believe, we need a term that denotes the recognizer that is used by the Forth text interpreter to translate a lexeme.

I don't believe we do.

I suggested the term "perceptor" for that:

  • perceptor: the recognizer that is used by the Forth text interpreter to translate lexemes.

A counterpart term is:

  • perceive: to recognize a lexeme using the perceptor.

Could you please suggest other terms for that?

Yes; but you won't consider it because it's in plainer if longer language (and I suspect you think it isn't priestly enough). A whole new lexicon is not needed.

I'm now repeating myself, so consider the discussion closed and carry on as you wish. Thanks.

@ruv
Collaborator Author

ruv commented Dec 18, 2024

@alextangent wrote:

Could you please suggest other terms for that?

Yes; but you won't consider it because it's in plainer if longer language

I would like to consider it. But I see only criticism, and no suggestions.

Thank you for the discussion :)

2 participants