Tact 2.0 RFC #249

Open
anton-trunov opened this issue Apr 13, 2024 · 25 comments
Labels: big (this is a hard task, more like a project, and will take a while to implement); discussion (ideas that are not fully formed and require discussion); language design; rfc (request for comments from community)

@anton-trunov
Member

anton-trunov commented Apr 13, 2024

This is a very incomplete draft proposal for the next major version of Tact. Comments are most welcome.

Grammar

  • ignore ... (or something else) -- useful for parsing incomplete examples in docs, where the triple dots mean "some code skipped";
  • Do not allow using struct, message, contract, init, bounced, external, primitive, map, receive, and get (or whatever we change those to) as identifiers; make them keywords instead, e.g. let struct: Int = 0 will become illegal;
  • Either make the if statement always have parentheses around its condition, or drop parens for while, repeat, do-until;
  • Make ; a statement separator, not a statement terminator, but allow ; for the last statement in a block;
  • Commas (,) should separate fields in struct/message declarations to make them consistent with struct/message definitions;
  • Compact Int type ascriptions for contract storage variables: e.g. x : int32 instead of x : Int as int32;
  • Getter definitions should make it clear that getters are not accessible on-chain (see Tact 2.0 RFC #249 (comment));
  • Internal and external message receivers syntax should be more consistent (see Hackathon feedback - why can't receivers be simple functions? #9 (comment));
  • Writing to storage variables should look different from analogous operations on temporary (stack) variables, e.g. storage_var <- expression;
  • Contract storage variables should be accessed using a keyword other than self, for example, storage.var (we might even want to hint at it syntactically with storage { var : Type ... } declaration inside a contract);
  • More concise syntax for map expressions and operations: map literals ({1: "foo", 2: "bar"}), map access (map[key]), map updates (map[key] = val and map[key] += inc, etc.);
  • Capitalize map type identifier: Map<K, V>;
  • Equality comparisons via hashing should be explicit. We compare cells and slices via hashing but use == / != which hides this fact. Switching to something like ==# / !=# makes it more obvious;
  • Syntactic support for sending messages, e.g. instead of send(SendParameters{value: amount, to: self.owner, mode: mode}) we can have something like send {value: amount, to: self.owner, mode: mode};
  • If a send can actually deploy a contract, it should be evident in the call, something like deploy {...} (e.g. send wouldn't be able to deploy, only deploy could);
  • sender() in contract initializers (init) should be called deployer();
  • Remove multi-line /**/ comments in favor of semantic single-line comments, see: Tact 2.0 RFC #249 (comment)

Semantic changes

  • Disallow using null to mean "empty map" (actually we might want to rethink our design of optionals and remove null completely);
  • Make == / != only accept expressions of the same type: no more implicit type conversions allowing comparing Int and Int?;
  • Disallow initializing maps with emptyMap by default, in other words m: map<Int, Int>; without the corresponding entry in init should produce a compilation error;
  • Disallow map comparisons with == and !=: it should be evident that these are not trivial operations in terms of gas consumption;
  • Send mode should get its own type, not Int;
  • Fix SendPayGasSeparately naming, see SendPayGasSeparately is a wrong naming #149;
  • Introduce the concept of namespace;
  • Imports should support namespaces;
  • Structs and (not yet implemented) enums should each create their own namespace;
  • send should support sending StateInit directly without destructuring it into the code and data parts (this can, for instance, simplify the forward method from the Base trait);
  • Introduce the Time type and possibly a TimeInterval type too, plus some primitives for working with absolute time and time intervals;
  • The now() builtin should return values of the type Time;
  • Think of a consistent design for contract interfaces. For instance, Tact v1.x allows messages, empty receivers and strings to contribute to a contract's interface, which makes it harder to specify the interface in a standalone file (as a set of messages it supports);
  • Disallow struct fields with default values: everything should be explicit, and if there is a need to initialize struct fields with the same value for many instances, then it might be a design issue;
  • Introduce the concept of a reference to pass compound objects to functions without copying;
  • Module system for better management of imports, which also would help with avoiding name clashes;
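
To make the Time / TimeInterval item above more concrete, here is a rough TypeScript sketch of the kind of separation it seems to ask for. Everything in it is an assumption made for illustration: the type shapes, the helper names (time, interval, addInterval, elapsed) and the choice of Unix seconds are hypothetical and are not taken from Tact or from this proposal.

```typescript
// Hypothetical model: absolute time and durations are distinct types,
// so they cannot be mixed up the way two plain integers can.
type Time = { readonly kind: "Time"; readonly unixSeconds: number };
type TimeInterval = { readonly kind: "TimeInterval"; readonly seconds: number };

// Constructors (illustrative names only).
function time(unixSeconds: number): Time {
  return { kind: "Time", unixSeconds };
}

function interval(seconds: number): TimeInterval {
  return { kind: "TimeInterval", seconds };
}

// Time + TimeInterval = Time
function addInterval(t: Time, d: TimeInterval): Time {
  return time(t.unixSeconds + d.seconds);
}

// Time - Time = TimeInterval
function elapsed(from: Time, to: Time): TimeInterval {
  return interval(to.unixSeconds - from.unixSeconds);
}

const deployedAt = time(1_700_000_000);
const deadline = addInterval(deployedAt, interval(3600));
console.log(elapsed(deployedAt, deadline).seconds);
```

With such a split, adding two absolute times or passing a duration where a timestamp is expected becomes a type error rather than a silent bug, which is presumably the motivation for having now() return Time instead of Int.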

FunC, Asm, support and interop

  • asm blocks support (see here)

Standard library, base trait, built-in functions

Tooling

  • Tool to help migrate Tact v1 projects to Tact v2

anton-trunov added the discussion, language design, and rfc labels Apr 13, 2024
anton-trunov added this to the v2.0.0 milestone Apr 13, 2024

@novusnota
Member

novusnota commented Apr 13, 2024

Couple of ideas:

  1. Hackathon feedback - why can't receivers be simple functions? #9 (comment)
  2. Promote get attribute to be its own function type to make it clear that they're special:
contract Example {
    field: Int;
    
    // Like this (removed `fun` keyword here):
    get field(): Int {
        return self.field;
    }
    
    // Or maybe even like this, to make it clear that they're off-chain
    // (and would therefore be called "offchain functions" and not "getter functions"):
    off getField(): Int {
        return self.field;
    }
    // ...or with the keyword spelled out in full:
    offchain getField(): Int {
        return self.field;
    }
}

@novusnota
Member

  • Commas (,) should separate fields in struct/message declarations to make them consistent with struct/message definitions;

Not sure here, as this will also influence fields (persistent state variables) in contracts and traits — they use semicolons in their declarations as of now

@anton-trunov
Member Author

Not sure here, as this will also influence fields (persistent state variables) in contracts and traits — they use semicolons in their declarations as of now

It's fine, though. Syntactically it's included between a pair of curly braces, so there won't be any grammar conflicts. And there is a nice principle saying that declarations and definitions should resemble each other. And it makes it resemble TS more, which is what we target in terms of syntax

@anton-trunov
Member Author

Added a couple more bullet points:

  • Writing to storage variables should look different from analogous operations on temporary (stack) variables, e.g. storage_var <- expression;
  • Contract storage variables should be accessed using a keyword other than self, for example, storage.var (we might even want to hint at it syntactically with storage { var : Type ... } declaration inside a contract);

anton-trunov pinned this issue Apr 13, 2024
@anton-trunov
Member Author

anton-trunov commented Apr 13, 2024

  • More concise syntax for map expressions and operations: map literals ({1: "foo", 2: "bar"}), map access (map[key]), map updates (map[key] = val and map[key] += inc, etc.)

@anton-trunov
Member Author

  • Syntactic support for sending messages, e.g. instead of send(SendParameters{value: amount, to: self.owner, mode: mode}) we can have something like send {value: amount, to: self.owner, mode: mode};

@anton-trunov
Member Author

  • Disallow initializing maps with emptyMap by default, in other words m: map<Int, Int>; without the corresponding entry in init should produce a compilation error;

@novusnota
Member

  • Tool to help migrate Tact v1 projects to Tact v2

We can use ast-grep, which uses Tree-sitter to perform AST-based search and replace. Additionally, it can lint code and produce error reports that look like Rust's, provided that we specify linting rules. This can be a starting point for our linters.

Apart from the CLI, it also has a Node.js binding, @ast-grep/napi, with jQuery-like utility methods to traverse and manipulate syntax tree nodes.

@anton-trunov
Member Author

@novusnota I was thinking we can have a tool that combines Tact v1 parser and Tact v2 code formatter: this way we can migrate contracts if we only have local syntactic changes. Plus we can do some more massaging of the original contract, like explicit map initializations.

@novusnota
Member

novusnota commented Apr 13, 2024

@anton-trunov Oh, so you want the code formatter to not only format, but also perform lint-like fixes? I thought of the formatter purely in terms of whitespace fixing, leaving the actual code replacements to tact lint --fix or similar

@anton-trunov
Member Author

  • Do not allow using struct, message, contract, init, bounced, external, primitive, map, receive, and get (or whatever we change those to) as identifiers; make them keywords instead, e.g. let struct: Int = 0 will become illegal;

@anton-trunov
Member Author

Oh, so you want the code formatter to not only format, but also perform lint-like fixes?

@novusnota Not at all, I was talking about creating a tool like tact-migrate-v1-to-v2, which would combine the parser for Tact v1 and the future code formatter for Tact v2, plus some more automatic migrations that are possibly not purely syntactic
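
As an illustration of what a purely local, syntactic migration could look like, here is a toy TypeScript sketch for one item from the proposal list: the compact Int ascription (x : int32 instead of x : Int as int32). It is only a sketch under stated assumptions. The function name compactIntAscriptions is hypothetical, it handles only intN / uintN formats, and it works on raw text with a regular expression, whereas the tool described above would operate on the AST produced by the Tact v1 parser.

```typescript
// Hypothetical migration step (not part of any existing tool): rewrite the
// v1 ascription `x: Int as int32` into the compact form `x: int32`.
// Only intN / uintN formats are matched; everything else is left untouched.
// Being regex-based, it would also rewrite matches inside comments or
// strings, which an AST-based tool would avoid.
function compactIntAscriptions(source: string): string {
  return source.replace(/:\s*Int\s+as\s+(u?int\d+)\b/g, ": $1");
}

const v1Source = [
  "contract Counter {",
  "    total: Int as uint64;",
  "    delta: Int as int32;",
  "    plain: Int;",
  "}",
].join("\n");

console.log(compactIntAscriptions(v1Source));
```

Fields without an "as" ascription, such as plain: Int, pass through unchanged, so the step can be run repeatedly on already-migrated code.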

@0kenx
Contributor

0kenx commented Apr 15, 2024

Disallow initializing maps with emptyMap by default, in other words m: map<Int, Int>; without the corresponding entry in init should produce a compilation error;

Why? I have multiple use cases where I need to declare an empty map without adding any entries in init (e.g. RBAC).

@anton-trunov
Member Author

Why? I have multiple use cases where I need to declare an empty map without adding any entries in init (e.g. RBAC).

Semantically, nothing will change here: the user will just have to explicitly state the initial value for the map. This part should be migrated automatically

@0kenx
Contributor

0kenx commented Apr 15, 2024

Any plans to revive this? https://github.com/tact-lang/docs-obsolete/blob/main/tact-design.md#interfaces

I'd love to see impl MyTrait for MyStruct { }

@anton-trunov
Member Author

Yep, we definitely need ad hoc polymorphism in Tact

@novusnota
Member

novusnota commented Apr 21, 2024

Not sure here, as this will also influence fields (persistent state variables) in contracts and traits — they use semicolons in their declarations as of now

It's fine, though. Syntactically it's included between a pair of curly braces, so there won't be any grammar conflicts. And there is a nice principle saying that declarations and definitions should resemble each other. And it makes it resemble TS more, which is what we target in terms of syntax

Well, as struct and message are essentially declaring a new type and we're trying to target TS in terms of syntax, we should also consider that in TS type declarations use semicolons (like we do) and not commas, see:

[image: screenshot of a TypeScript type declaration with semicolon-separated members]
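
For reference, a minimal TypeScript sketch of the distinction being debated here. Type declarations conventionally separate members with semicolons (TypeScript also accepts commas in that position), while object literals, the "definitions" side, require commas. The names Point, PointAlt, origin and alt are made up for the example.

```typescript
// Type declaration: members conventionally end with semicolons.
interface Point {
  x: number;
  y: number;
}

// The comma-separated form of a type declaration also compiles.
type PointAlt = { x: number, y: number };

// Value definition (object literal): commas are the only option.
const origin: Point = { x: 0, y: 0 };

// Both declarations describe the same shape, so they are interchangeable.
const alt: PointAlt = origin;

console.log(origin.x, alt.y);
```

So TypeScript itself can be cited for either choice: semicolons match its prevailing declaration style, while commas are still accepted and mirror the literal syntax.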

@novusnota
Member

It's just a suggestion, but do we need a bounced<> type wrapper going forward? Maybe having the bounced() receiver is clear enough?

anton-trunov added the big label Apr 24, 2024

@novusnota
Member

novusnota commented Apr 25, 2024

... as struct and message are essentially declaring a new type ...

Regarding struct vs. message: users of Tact may confuse messages as type definitions with messages as a means of contract interaction on the blockchain.

I'm not sure if that's a valid issue (need to gather community feedback here), but it may be beneficial to merge struct and message definitions under the same name, say, struct or even tuple, to later make it closer to an advanced successor of TVM's tuples.

That way we could clarify things and make it simpler to read and write Tact code. And that would also alleviate the need to capitalize Messages/Structs in the docs to refer to the types and not to the communication itself :)

@0kenx
Contributor

0kenx commented Apr 25, 2024

I think the current definition of struct and message is clear enough and they offer clear separation of usage scenarios.

If we were to merge the two then what would become message(0x1234) M {}? Does struct(0x1234) M {} even make sense?

@andreypfau

andreypfau commented Apr 30, 2024

Equality comparisons via hashing should be explicit. We compare cells and slices via hashing but use == / != which hides this fact. Switching to something like ==# / !=# makes it more obvious;

Maybe it would be better to do it like Kotlin/JS/TS/PHP, using the === and !== operators? They also have ligatures in most monospace fonts

@novusnota
Member

novusnota commented May 9, 2024

Suggestion:

Get rid of multi-line /**/ comments. Instead, use variations of single-line comments //:

  • //! as the top-level comments describing the current file (may be omitted, I guess)
  • and /// (or regular //, for super-simplification!) as documentation comments for the lines that follow, as it's done in Zig, Rust, Solidity, and, to some extent, Dart.

Motivation:

  1. Current multi-line doc comments are, in fact, written as if they were multiple single-line comments, because every line is prefixed by *. That's a very redundant feature of JSDoc and JavaDoc, which removes the whole point (IMO) of having them over just a bunch of consecutive single-line ones.

  2. It's much easier for users to type and maintain single-line // comments and their series, especially without the advanced help from the editor. More comments, more documentation, more clarity!

  3. It's much easier for us to NOT have multi-line comments, as they're the only token in Tact which can span multiple lines and completely change the list of tokens and the parsed AST of the rest of the file as soon as the opening /* appears. Not having any multi-line tokens would significantly ease the lexing phase from the perspective of incremental updates (we can then lex and re-lex by lines, essentially making way for lexing in parallel, like Zig does), and it would also improve the performance of the lexing and parsing phases! Blazing-fast AST generation for the compiler and for tooling.
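
The per-line lexing argument in the last point can be sketched in a few lines of TypeScript. This is a toy model under loud assumptions: the token kinds, the regular expression and the function names (lexLine, lexSource, relexLine) are all hypothetical and far simpler than Tact's real grammar. It only shows that when no token can cross a line break, each line can be lexed in isolation and an edit requires re-lexing one line.

```typescript
// Toy token model; a real Tact lexer distinguishes many more kinds.
type TokenKind = "comment" | "string" | "word" | "punct";
interface Token {
  kind: TokenKind;
  text: string;
}

// Matches, in priority order: a // comment to end of line, a one-line
// string literal, an identifier, a number, or any other single symbol.
const TOKEN_RE = /\/\/.*|"[^"]*"|[A-Za-z_][A-Za-z0-9_]*|\d+|[^\sA-Za-z0-9_]/g;

function classify(text: string): TokenKind {
  if (text.indexOf("//") === 0) return "comment";
  if (text.charAt(0) === '"' && text.length > 1) return "string";
  if (/^[A-Za-z0-9_]/.test(text)) return "word";
  return "punct";
}

// Lex one line with no knowledge of any other line. This is only sound
// because no token (such as /* ... */) may span a line break.
function lexLine(line: string): Token[] {
  const pieces: string[] = line.match(TOKEN_RE) ?? [];
  return pieces.map((text) => ({ kind: classify(text), text }));
}

// Lex a whole file as independent rows, one per line.
function lexSource(source: string): Token[][] {
  return source.split("\n").map((line) => lexLine(line));
}

// After an edit to one line, re-lex just that line. Rows for all other
// lines are reused as-is, which is the incremental-update benefit.
function relexLine(rows: Token[][], index: number, newText: string): Token[][] {
  const next = rows.slice();
  next[index] = lexLine(newText);
  return next;
}

const rows = lexSource("let x = 1; // note\nreturn x;");
const edited = relexLine(rows, 1, "return x + 1;");
console.log(edited.map((row) => row.map((t) => t.text)));
```

With /* ... */ in the language, lexLine would need an extra "currently inside a comment" state carried over from the previous line, so typing /* on one line could invalidate the tokens of every line after it. Independent rows are also what would make lexing lines in parallel possible.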

@anton-trunov
Member Author

Great suggestion! Looks like it's about time to move this issue to its own repo and continue discussions in separate issues there. Wdyt?

@novusnota
Member

Well, yeah, this can be moved to tact-lang/roadmap or to some more RFC-specific repo and then referenced from this issue

@novusnota
Member

novusnota commented May 25, 2024

We may try to add direct TVM assembly function wrappers, similar to how it can be done in FunC: RETALT example. This will remove the need to write them in FunC, and then import and wrap in a native function in Tact — we would just write assembly ourselves.

Perhaps, it's best to add another attribute to native functions in Tact, so that we have @asm("...") in addition to the existing @name(func_name_here):

@asm("RETALT")
native returnAlt();

// or without a string literal, similar to @name(...)

@asm(RETALT)
native returnAlt();

// and, perhaps, also allow replacing parentheses () with braces {}
// to write multi-line assembly right between them

@asm{
    <{
    }>CONT // c
    0 SETNUMARGS // c'
    2 PUSHINT // c' 2
    SWAP // 2 c'
    1 -1 SETCONTARGS
} native stackOverflow();

Syntax aspects are up for discussion though, not 100% sure how this should be arranged :)

4 participants