Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a "type" checker to Forth #79

Open
massung opened this issue Jan 10, 2019 · 11 comments
Open

Adding a "type" checker to Forth #79

massung opened this issue Jan 10, 2019 · 11 comments
Labels
implementation How to implement a Forth system typesystem Data types in Forth

Comments

@massung
Copy link
Member

massung commented Jan 10, 2019

One toy project I've played around with in the past is the adding a "type" checker to Forth. I don't mean that in the sense of types like C and a compiler, but rather just confirming inputs/outputs by assuming the stack notation identified actual types. For example:

: c" ( -- caddr )
: s" ( -- caddr n )
: dup ( x -- x x )
: + ( n n -- n )
: /string ( caddr n n -- caddr n )

Assuming that some "types" (e.g. x or _) were pre-defined to mean "what's there", and any other symbol was just a type created on-demand (no pre-declaring of types being necessary), it should be possible to take something like s" Hello, world!" 10 dup + /string and validate that all the stack values and output types match with the required input types specified by each word. Likewise, if I'd used c" instead by accident, it should have discovered at /string that the expected stack values didn't match the required inputs.

A few times I've gone down this road, trying it out for fun, but kept getting hung-up on in a couple areas:

  • Dynamic words like execute totally borking the stack. The only idea I had was to assume the programmer knew what they were doing and pretend the rest of the word was "good". For example:
: foo ( n -- caddr ) ['] bar execute do something ;

After execute, just skip type checking until ; and then assume the stack is updated to whatever foo says it should be. There was an instance where this failed, but I can't recall what it was.

  • Branching return values (e.g. find). These almost always require context to correctly deduce. And I couldn't ever come up with a checking strategy that "just worked".

With find for example, the second stack item could be caddr or xt, and which it is depends on the top stack item's value (not type). The best I could come up with was just branching the type checking and at each step determining if either branch was successful.

Sorry, long post, thinking out-loud remembering a past bit of work I did, but wondering if anyone here had ever tried something similar (or knew of a similar attempt)? How did it turn out? Was it useful? Do you think something like this would be valuable in your own forth system?

@massung
Copy link
Member Author

massung commented Jan 10, 2019

Just remembering one other issue I had:

  • Loops that didn't leave the stack balanced.

A word like : iota ( n -- i*x ) do i loop ; would leave an indeterminate number of items on the stack, and that ended up causing enumerable headaches trying to resolve. I never came up with a good solution to this one.

@alexshpilkin
Copy link
Member

alexshpilkin commented Jan 11, 2019 via email

@MitchBradley
Copy link

You can't solve every problem so pick a problem that you can solve that has a worthwhile payoff, solve it, and iterate.
Einstein failed to find a theory of everything, but the theories that he did find are pretty darned useful.
I once did a quick and dirty run-time checker that just looked for stack balance. It included an abbreviated representation of the stack diagram in the compiled representation of the word. If I recall correctly, I just recorded the number of arguments and the number of results. Upon entry, the stack was checked for enough depth for that many arguments, and upon exit, for the correct net depth change. For complex cases I just punted and didn't check. It helped find some errors in a large system. Good value for a modest amount of effort.

@jwoehr
Copy link

jwoehr commented Jan 11, 2019 via email

@alexshpilkin
Copy link
Member

@MitchBradley wrote:

You can't solve every problem so pick a problem that you can solve that has a worthwhile payoff, solve it, and iterate.

Well, I did warn that my definition of “works” is called something between “too radical” and “useless” by many people, including people who are vastly better at this than I am. I’m aware I have an unhealthy attachment to formal systems (physics does that to you). So, no contest here.

Einstein failed to find a theory of everything, but the theories that he did find are pretty darned useful.

I must take offense at this, though: SR, GR, whatever other modern physical theory you care to name are limited, yes, but the way we use them now, they come with a well-defined range of validity built in. More specifically, a theory consists of a more or less internally consistent formal model plus a prescription in which situations the formal model will describe what will actually happen. A physicist always has a sense of whether a particular description will or will not work—even if it’s not possible to check this in advance, you at least know what you’ll be checking as you go.

So I don’t think it’s fair to say modern physics does demand-driven, ad hoc iterative improvement in the same way that e.g. compiler optimizers do; internal consistency (compositionality?) at each step is also important. Maybe engineering does, I’ve honestly no idea.

If we try to transplant it back to type systems, it’ll probably go like this: a type system doesn’t have to be all-singing and all-dancing, but it’d better be explicit and (definitionally, not operationally) simple about what it will and won’t do. If a type is just (# cells in, # cells out), execute will be untypeable, and this is fine: “only first-order words are typeable“ is a reasonable rule; but if I can write execute without noticing it, without a trustMe or unsafeCoerce or whatever, it’ll make me have PHP nightmares^W^W^Wsad.

@jwoehr wrote:

Right, "if we had $10 for every time someone suggested adding type checking
to Forth", right, Mitch?

Not to say there’s anything wrong with your statement in particular, but I think it’s not fair to place the blame completely on the suggesters, either, because of a general problem:

The literature sucks.

I mean, what do you have about Forth that counts as a real review of the design space, hell, as a honest-to-God literature review of any kind? Thinking Forth (even if it’s half advice on running a software consultancy), Moving Forth (best review ever to not call itself a review), a couple more tutorial implementations, a summary article on the CASE contest and... that’s it?

I can read a Forth Dimensions issue or ten; some JFAR articles; no FORML articles because, well, fat chance I can access any of them... But that’s not reasonable; I’m not an actual programmer precisely because I enjoy this sort of thing too much. Someone who’s better at, y’know, actually building stuff won’t do that!

And that is why you need to have literature reviews. If you compare with science, it seems that the Forth literature is perhaps small enough for one person to be able to read it all (like linguistics; unlike any area of physics), but that presumes reading things is (as in science) one of their primary activities, which I don’t think it is.

So, I’m aware everything has probably been discussed to hell and back... But can we please have some references?

@alexshpilkin
Copy link
Member

(I don’t know you, so sorry if that came of as too flippant... the desire to make a cutely arranged argument overcame the effort to make a polite one.)

@jwoehr
Copy link

jwoehr commented Jan 11, 2019 via email

@cwpjr
Copy link
Member

cwpjr commented Jan 14, 2019 via email

@MitchBradley
Copy link

When I first got interested in Forth, I think it was in 1982, I read everything I could find about Forth. Back issues of Forth Dimensions, FORML Proceedings, JFAR issues, ... I was hoping not to reinvent the wheel. I was able to access all that because the primary distributor of Forth literature, Mountain View Press, was run out of Glen Haydon's garage, which happened to be a few blocks away from my house!

Since then I have seen just about every imaginable wheel being reinvented innumerable times. That's sort of inevitable, given the time it would take to delve through all the history.

In answer to the request for a reference: For some early work on Forth type checking, have a look at Peter Knaggs' PhD thesis

@alexshpilkin
Copy link
Member

@MitchBradley wrote:

For some early work on Forth type checking, have a look at Peter Knaggs' PhD thesis

Thanks for the reference! Somehow I missed until now that Knaggs is also the one Kleffner credits with the polymorphic type system in section 4 of the talk I linked to before, although the latter describes it in terms much closer to the PL literature (indeed, the former seems unaware of the word “polymorphism” itself).

@massung: For some reason I forgot to say this in my initial post, so here goes: the rest of RK’s talk (though phrased in Forth-unfriendly terms) works up to a solution due to Diggins (2008) of the problem of (in Forth-friendly terms) typing EXECUTE (as well as functional composition >R EXECUTE R> EXECUTE and so on).

On a very different note, nobody here has mentioned StrongForth yet (and I only just read the manual earlier today). This seems to be a system heavily focused on ad-hoc overloading and of a very different style in general: it seems to me that a StrongForth program without its types cannot be given meaning at all (i.e. this is a Church-style system, unlike the systems of Knaggs and Diggins that can work Curry-style, on existing untyped Forth code). I’m still a bit confused by it, though, so please correct me if I’m wrong.

@olleharstedt
Copy link

olleharstedt commented Oct 25, 2022

Nice read :)

I recently got recommended Forth and of course got trapped in the same bucket as every other newbie. My personal attempt to add types to Forth was to embed it like the Haskell example above, but using phantom types + a syntax extension to remove ugly notation.

Original (and trivial) Forth snippet:

1 2 + .

In OCaml, explicitly passing around the stack state:

(((start () |> num 1) |> num 2) |> plus) |> dot

In the syntax extension (written by Chet Murthy; still inside normal OCaml code):

[%forth 1 2 plus dot;]

Maybe a limited approach. The functions have to be typed explicitly, like:

val plus : ([`number] * ([`number] * 'a)) t -> ([`number] * 'a) t

To allow proper Forth words, you'd need to add a lexer, which you can do while staying inside the OCaml environment. It could be made interactive too, by including the syntax extension in the so called toplevel.

A nice property of this approach is that it spits out "normal" Forth code. So it's a strict superset (or maybe rather a typed subset) of Forth.

Related forum thread: https://discuss.ocaml.org/t/can-i-write-a-ppx-to-replace-1-with-add-to-stack-1/10680/

@ruv ruv added typesystem Data types in Forth implementation How to implement a Forth system labels Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
implementation How to implement a Forth system typesystem Data types in Forth
Projects
None yet
Development

No branches or pull requests

7 participants