Skip to content

[EXPERIMENT] Add node-id based ECMA semantics#11818

Draft
kdy1 wants to merge 2 commits intoswc-project:mainfrom
kdy1:kdy1/node-id-semantics
Draft

[EXPERIMENT] Add node-id based ECMA semantics#11818
kdy1 wants to merge 2 commits intoswc-project:mainfrom
kdy1:kdy1/node-id-semantics

Conversation

@kdy1
Copy link
Copy Markdown
Member

@kdy1 kdy1 commented Apr 24, 2026

Summary

  • Add NodeId to swc_ecma_ast::Ident as an occurrence id with NodeId(0) reserved as invalid/default.
  • Add swc_ecma_transforms_base::semantics with fresh node-id assignment plus ScopeId, SymbolId, and ReferenceId side tables.
  • Wire resolver entrypoints to assign fresh node ids, add SymbolId-based rename adapter APIs, and expose node-id access through IdentLike.
  • Assign fresh node ids at minifier analysis entrypoints so the new semantics path can be shared by follow-up internal migrations.

Notes

This is intentionally experimental. The AST and semantics foundation is in place, but the minifier's internal ProgramData variable keys are not fully migrated from legacy Id to SymbolId yet.

Test Plan

  • git submodule update --init --recursive
  • cargo test -p generate-code test_ecmascript -- --nocapture
  • cargo check -p swc_ecma_ast -p swc_ecma_visit -p swc_ecma_hooks
  • cargo check -p swc_ecma_transforms_base -p swc_ecma_utils -p swc_ecma_minifier
  • UPDATE=1 cargo test -p swc_ecma_transforms_base --test semantics -- --ignored
  • cargo test -p swc_ecma_transforms_base --test semantics -- --ignored
  • cargo test -p swc_ecma_transforms_base
  • cargo test -p swc_ecma_utils
  • cargo test -p swc_ecma_minifier
  • (cd crates/swc_ecma_minifier && UPDATE=1 ./scripts/test.sh)
  • (cd crates/swc_ecma_minifier && ./scripts/test.sh)
  • (cd crates/swc_ecma_minifier && ./scripts/exec.sh)
  • cargo check -p swc_ecma_transformer -p swc_typescript -p swc_ecma_transforms_module
  • cargo fmt --all
  • cargo clippy --all --all-targets -- -D warnings
  • cargo test -p swc_ecma_ast -p swc_ecma_visit -p swc_ecma_hooks -p swc_ecma_transformer -p swc_typescript -p swc_estree_compat
  • cargo test -p swc_bundler --lib

cargo test -p swc_bundler ... with integration tests was attempted, but swc_bundler --test deno requires a local deno executable and failed with No such file or directory.

Copilot AI review requested due to automatic review settings April 24, 2026 06:22
@kdy1 kdy1 requested review from a team as code owners April 24, 2026 06:22
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Apr 24, 2026

⚠️ No Changeset found

Latest commit: d3f14b7

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@kdy1 kdy1 marked this pull request as draft April 24, 2026 06:23
@kdy1
Copy link
Copy Markdown
Member Author

kdy1 commented Apr 24, 2026

I wanted to profile this approach using Codspeed

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Introduces experimental node-id–based semantic analysis across SWC’s ECMA AST by adding NodeId to Ident, generating visitor/hook support, and wiring resolver/minifier/rename entrypoints to assign node ids and enable symbol-based renaming.

Changes:

  • Add NodeId (occurrence id) to swc_ecma_ast::Ident, plus visitor/hook generated support for traversing it.
  • Introduce swc_ecma_transforms_base::semantics with node-id assignment and scope/symbol/reference side tables + new semantics fixture tests.
  • Wire resolver and minifier analysis entrypoints to assign fresh node ids; add SymbolId-based rename adapter APIs.

Reviewed changes

Copilot reviewed 31 out of 31 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
crates/swc_typescript/src/fast_dts/enum.rs Populate new Ident.node_id field for synthesized identifiers.
crates/swc_estree_compat/src/swcify/expr.rs Populate new Ident.node_id field during ESTree → SWC conversion.
crates/swc_ecma_visit/src/generated.rs Generated visitor/fold/path plumbing for NodeId and Ident.node_id.
crates/swc_ecma_utils/src/ident.rs Extend IdentLike with node_id() and implement it for common identifier-like types.
crates/swc_ecma_transforms_base/tests/semantics.rs Add harness to snapshot-dump new semantics output from fixtures.
crates/swc_ecma_transforms_base/tests/node-id-semantics/unresolved/input.js Fixture exercising unresolved references.
crates/swc_ecma_transforms_base/tests/node-id-semantics/unresolved/output.txt Expected snapshot output for unresolved fixture.
crates/swc_ecma_transforms_base/tests/node-id-semantics/ts-type-value/input.ts Fixture exercising TS type/value namespaces.
crates/swc_ecma_transforms_base/tests/node-id-semantics/ts-type-value/output.txt Expected snapshot output for TS type/value fixture.
crates/swc_ecma_transforms_base/tests/node-id-semantics/shadow/input.js Fixture exercising shadowing across scopes.
crates/swc_ecma_transforms_base/tests/node-id-semantics/shadow/output.txt Expected snapshot output for shadowing fixture.
crates/swc_ecma_transforms_base/tests/node-id-semantics/imports-exports/input.js Fixture exercising imports/exports.
crates/swc_ecma_transforms_base/tests/node-id-semantics/imports-exports/output.txt Expected snapshot output for imports/exports fixture.
crates/swc_ecma_transforms_base/tests/node-id-semantics/class-function-expr/input.js Fixture exercising named class/function expressions.
crates/swc_ecma_transforms_base/tests/node-id-semantics/class-function-expr/output.txt Expected snapshot output for class/function expr fixture.
crates/swc_ecma_transforms_base/tests/node-id-semantics/catch-param-var/input.js Fixture exercising catch param vs var-hoist interaction.
crates/swc_ecma_transforms_base/tests/node-id-semantics/catch-param-var/output.txt Expected snapshot output for catch/var fixture.
crates/swc_ecma_transforms_base/src/semantics.rs New node-id assignment + semantic analysis implementation (scopes/symbols/references).
crates/swc_ecma_transforms_base/src/resolver/mod.rs Assign node ids once at resolver entrypoints.
crates/swc_ecma_transforms_base/src/rename/ops.rs Add a SymbolOperator that renames using Semantics symbol resolution.
crates/swc_ecma_transforms_base/src/rename/mod.rs Expose SymbolRenameMap and rename_symbols API.
crates/swc_ecma_transforms_base/src/lib.rs Export new semantics module.
crates/swc_ecma_transformer/src/es2016/exponentiation_operator.rs Populate new Ident.node_id field for synthesized Math ident.
crates/swc_ecma_minifier/src/pass/mangle_props.rs Assign node ids before property-mangler analysis entrypoint.
crates/swc_ecma_minifier/src/lib.rs Assign node ids at minifier optimize entrypoint.
crates/swc_ecma_minifier/src/compress/mod.rs Assign node ids before minifier ProgramData analysis passes.
crates/swc_ecma_hooks/src/generated.rs Generated hook plumbing for NodeId enter/exit events.
crates/swc_ecma_ast/src/node_id.rs New NodeId type and invariants (0 reserved invalid).
crates/swc_ecma_ast/src/lib.rs Export NodeId and include new module.
crates/swc_ecma_ast/src/ident.rs Add node_id: NodeId field to Ident and update constructors/conversions.
crates/swc_bundler/src/id.rs Extend IdentLike impl for bundler Id with node_id() returning invalid.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 90 to 93
if self.options.hoist_vars || self.options.hoist_fns {
assign_node_ids(n);
let data = analyze(&*n, Some(self.marks), false);

Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assign_node_ids(n) adds a full AST traversal immediately before program_data::analyze, but ProgramData analysis currently keys off (Atom, SyntaxContext) and does not use Ident.node_id. If node ids are only needed for the new experimental semantics path, consider guarding this behind a config/feature or deferring assignment until a node-id consumer is actually used, to avoid unnecessary overhead.

Copilot uses AI. Check for mistakes.
Comment on lines 126 to 129

assign_node_ids(m);
let mut data = analyze(&*m, None, true);

Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assign_node_ids(m) adds an extra full AST traversal here, but the subsequent program_data::analyze/property atom collection does not currently use Ident.node_id (it relies on symbol + SyntaxContext). If this is only for the upcoming node-id semantics migration, consider gating it behind a flag or moving it to the code path that actually consumes node ids to avoid the extra walk.

Copilot uses AI. Check for mistakes.

pub trait IdentLike: Sized + Send + Sync + 'static {
fn from_ident(i: &Ident) -> Self;
fn node_id(&self) -> NodeId;
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IdentLike gained a new required method (node_id) without a default implementation, which is a breaking change for downstream crates implementing this public trait. Consider providing a default implementation (e.g. returning an invalid NodeId) so existing implementors keep compiling, and only override it where meaningful.

Suggested change
fn node_id(&self) -> NodeId;
fn node_id(&self) -> NodeId {
NodeId::invalid()
}

Copilot uses AI. Check for mistakes.
self.scope_stack
.iter()
.find(|frame| frame.id == scope)
.and_then(|frame| frame.decls.get(&(name.clone(), namespace)).copied())
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

find_in_scope clones Atom just to perform a map lookup (name.clone()), which can add noticeable overhead in large ASTs. Consider restructuring decls so lookups can be done by borrowing (e.g. nested maps keyed by Atom/namespace, or a custom key type with Borrow) to avoid per-lookup cloning.

Suggested change
.and_then(|frame| frame.decls.get(&(name.clone(), namespace)).copied())
.and_then(|frame| {
frame
.decls
.iter()
.find(|((decl_name, decl_namespace), _)| {
decl_name == name && *decl_namespace == namespace
})
.map(|(_, symbol)| *symbol)
})

Copilot uses AI. Check for mistakes.
Comment on lines +488 to +492
fn resolve_symbol(&self, name: &Atom, namespace: SymbolNamespace) -> Option<SymbolId> {
for frame in self.scope_stack.iter().rev() {
if let Some(symbol) = frame.decls.get(&(name.clone(), namespace)) {
return Some(*symbol);
}
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolve_symbol clones Atom on every scope-frame lookup (name.clone() inside the loop). This makes resolution O(depth) and allocates/clones per frame. Consider changing the decls keying strategy to allow borrowed lookups so resolution doesn't require cloning for each frame.

Copilot uses AI. Check for mistakes.
Comment on lines +1260 to +1261
fn visit_import_decl(&mut self, import: &ImportDecl) {
self.predeclare_import(import);
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imports are predeclared in predeclare_module_decl, but visit_import_decl also calls predeclare_import, which re-walks and re-declares the same bindings. This is redundant work for every import; consider making one of these paths a no-op (e.g. only predeclare in the initial scan, and skip during the main visit).

Suggested change
fn visit_import_decl(&mut self, import: &ImportDecl) {
self.predeclare_import(import);
fn visit_import_decl(&mut self, _import: &ImportDecl) {
// Import bindings are predeclared during the initial module scan.
// Skip redeclaring them during the main visit.

Copilot uses AI. Check for mistakes.
typed!(visit_mut_ts_namespace_export_decl, TsNamespaceExportDecl);

fn visit_mut_program(&mut self, program: &mut Program) {
self.assign_node_ids_once(program);
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolver now performs an extra full-tree traversal to assign NodeIds before doing its normal work. If most resolver callers don't need node ids, consider making node-id assignment opt-in (config flag/feature) or exposing it as a separate pass, to avoid doubling traversal cost by default.

Suggested change
self.assign_node_ids_once(program);
#[cfg(feature = "resolver_node_ids")]
{
self.assign_node_ids_once(program);
}

Copilot uses AI. Check for mistakes.
@github-actions
Copy link
Copy Markdown
Contributor

Binary Sizes

File Size
swc.linux-x64-gnu.node 27M (27841736 bytes)

Commit: ba84514

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d3f14b71ec

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +773 to +777
&decl.ident,
self.current_scope(),
SymbolKind::Class,
SymbolNamespace::Value,
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Register class symbols in type namespace

With SemanticConfig { typescript: true }, class names are valid type references (for example let x: C), but this declaration path only records Decl::Class in SymbolNamespace::Value. Later type-resolution paths (such as visit_ts_type_ref) look up SymbolNamespace::Type, so class type uses are marked unresolved and downstream SymbolId-based passes can miss those bindings. Add a corresponding type-namespace declaration for class symbols when TypeScript analysis is enabled.

Useful? React with 👍 / 👎.

Comment on lines +641 to +644
let new_sym = new_id.atom();
if ident.sym.eq(new_sym) {
return;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Apply ctxt-only renames in SymbolOperator

This early return skips updates whenever the target atom matches, but RenamedVariable also carries SyntaxContext. If a caller needs to retag a symbol without changing its text (same atom, different ctxt), the rename silently becomes a no-op and references keep the old binding identity. The condition should account for ctxt too (or still assign ctxt when atoms are equal).

Useful? React with 👍 / 👎.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Apr 24, 2026

Merging this PR will degrade performance by 13.2%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 1 improved benchmark
❌ 18 regressed benchmarks
✅ 200 untouched benchmarks
⏩ 31 skipped benchmarks1

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Benchmark BASE HEAD Efficiency
es/large/minify/libraries/antd 2.6 s 2.7 s -3.12%
es/large/minify/libraries/d3 544 ms 556.6 ms -2.27%
es/large/all/es2018 120 ms 123.1 ms -2.48%
es/large/all/es2016 121.7 ms 124.8 ms -2.54%
es/large/minify/libraries/echarts 2.1 s 2.2 s -3%
es/large/minify/libraries/three 894.7 ms 925.1 ms -3.28%
es/large/minify/libraries/terser 511.9 ms 530.1 ms -3.44%
es/large/minify/libraries/typescript 5 s 5.2 s -3.03%
es/large/minify/libraries/victory 1.2 s 1.3 s -2.94%
es/minifier/libs/antd 2 s 2.1 s -4.12%
es/minifier/libs/three 668.5 ms 696 ms -3.94%
es/minifier/libs/victory 895.1 ms 935.2 ms -4.29%
es/minifier/libs/typescript 4 s 4.1 s -3.87%
es/minifier/libs/echarts 1.6 s 1.7 s -3.93%
es/minifier/libs/terser 375.1 ms 387.8 ms -3.28%
es/full-target/es2018 588.8 µs 575.8 µs +2.26%
es/full-target/es2017 629.6 µs 647.5 µs -2.76%
es/resolver_with_hygiene/typescript 758 ms 795.9 ms -4.75%
es/resolver/typescript 248.2 ms 286 ms -13.2%

Comparing kdy1:kdy1/node-id-semantics (d3f14b7) with main (fe38342)

Open in CodSpeed

Footnotes

  1. 31 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants