Background
The current downstream implementation of late parsing for bounds-safety attributes (counted_by, sized_by, ended_by, etc.) first creates a version of the type without late-parsed attributes, tracking late parsing information in a separate data structure along with the nested type index. When late parsing is triggered (e.g., after the whole structure body is parsed), it walks through the declaration type's nested type nodes to the level indicated by the index and inserts the wrapper type (e.g., CountAttributedType).
The upstream PR llvm#179612 ("[BoundsSafety] Support bounds-safety attributes in type positions") introduces a new approach for handling these attributes in type positions.
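To make the index-based scheme concrete, here is a minimal sketch on a toy type tree. Everything in it (TypeNode, wrapAtLevel, the printed syntax) is hypothetical and only models the idea; it is not Clang's Type hierarchy or the downstream implementation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

namespace index_sketch {

// Toy type node (hypothetical): a builtin leaf, a pointer to an inner type,
// or a count-attributed wrapper around an inner type.
struct TypeNode {
  enum Kind { Builtin, Pointer, CountAttributed } kind;
  std::string text;                 // builtin name or count expression
  std::shared_ptr<TypeNode> inner;  // pointee or wrapped type; null for Builtin
};
using TypePtr = std::shared_ptr<TypeNode>;

inline TypePtr builtin(std::string name) {
  return std::make_shared<TypeNode>(
      TypeNode{TypeNode::Builtin, std::move(name), nullptr});
}

inline TypePtr pointerTo(TypePtr inner) {
  return std::make_shared<TypeNode>(
      TypeNode{TypeNode::Pointer, std::string(), std::move(inner)});
}

// Index-based insertion: descend `level` nodes from the outermost type and
// wrap whatever is found there. The caller has to keep `level` in sync with
// the shape of the type, which is the bookkeeping this issue wants to remove.
inline TypePtr wrapAtLevel(const TypePtr &type, unsigned level,
                           const std::string &countExpr) {
  if (level == 0)
    return std::make_shared<TypeNode>(
        TypeNode{TypeNode::CountAttributed, countExpr, type});
  assert(type->inner && "level index walked past the innermost type");
  return std::make_shared<TypeNode>(TypeNode{
      type->kind, type->text, wrapAtLevel(type->inner, level - 1, countExpr)});
}

// Print innermost-first so the wrapper position is easy to see.
inline std::string print(const TypePtr &type) {
  switch (type->kind) {
  case TypeNode::Builtin:
    return type->text;
  case TypeNode::Pointer:
    return print(type->inner) + " *";
  case TypeNode::CountAttributed:
    return print(type->inner) + " counted_by(" + type->text + ")";
  }
  return std::string();
}

} // namespace index_sketch
```

In this model the same type yields different results depending on the index alone, so a stale or miscomputed index silently attaches the attribute to the wrong pointer level.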
Goal
Adopt the upstream approach downstream. The upstream PR only supports counted_by on struct fields. Downstream, we additionally support these attributes on function parameters and more late-parsed attributes like ended_by. Adopting the upstream approach means adjusting all of this downstream code to use the same mechanism consistently.
I have verified locally that this approach can attach the counted_by attribute to function parameters, and confirmed the result in the AST dump.
Approach
Instead of tracking a type position index for late-parsed attributes, create a placeholder type (LateParsedAttrType) during initial type construction. When late parsing occurs, TreeTransform rebuilds the declaration, replacing the placeholder with the concrete type (e.g., CountAttributedType). This avoids index-tracking issues with complex types (e.g., templated C++ types) and was agreed upon with @AaronBallman and other Clang contributors.
Key elements of the upstream PR:
Introduces LateParsedAttrType placeholder type to defer attribute processing until the complete struct/function definition is available
Moves LateParsedDeclaration and LateParsedAttribute out of the Parser class
Moves LateParsedAttrList out of Parser.h into DeclSpec.h
Introduces LateParsedTypeAttribute (child of LateParsedAttribute) for type attribute handling
Adds LateParsedAttribute vectors to DeclaratorChunk, Declarator, and DeclSpec
Resolves placeholders via TreeTransform to concrete types (e.g., CountAttributedType)
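The placeholder-then-rebuild idea can be sketched on a toy type tree as follows. This is only an illustration under stated assumptions: TypeNode, resolvePlaceholders, and the string ids are hypothetical stand-ins, and the recursive rebuild merely imitates what a TreeTransform-style pass does. It is not the upstream code.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

namespace placeholder_sketch {

// Toy type node (hypothetical, not Clang's Type hierarchy).
struct TypeNode {
  enum Kind { Builtin, Pointer, LatePlaceholder, CountAttributed } kind;
  std::string text;  // builtin name, deferred-attribute id, or count expression
  std::shared_ptr<TypeNode> inner;  // pointee or wrapped type; null for Builtin
};
using TypePtr = std::shared_ptr<TypeNode>;

inline TypePtr make(TypeNode::Kind kind, std::string text, TypePtr inner) {
  return std::make_shared<TypeNode>(
      TypeNode{kind, std::move(text), std::move(inner)});
}

// Rebuild the type bottom-up. Each placeholder is swapped for the concrete
// wrapper using the argument that became available after late parsing.
// `parsedArgs` maps a deferred-attribute id to its parsed count expression.
inline TypePtr
resolvePlaceholders(const TypePtr &type,
                    const std::map<std::string, std::string> &parsedArgs) {
  if (!type)
    return nullptr;
  TypePtr inner = resolvePlaceholders(type->inner, parsedArgs);
  if (type->kind == TypeNode::LatePlaceholder) {
    auto found = parsedArgs.find(type->text);
    assert(found != parsedArgs.end() &&
           "placeholder without a late-parsed argument");
    return make(TypeNode::CountAttributed, found->second, inner);
  }
  return make(type->kind, type->text, inner);
}

// Print innermost-first so the wrapper position is easy to see.
inline std::string print(const TypePtr &type) {
  switch (type->kind) {
  case TypeNode::Builtin:
    return type->text;
  case TypeNode::Pointer:
    return print(type->inner) + " *";
  case TypeNode::LatePlaceholder:
    return print(type->inner) + " <late:" + type->text + ">";
  case TypeNode::CountAttributed:
    return print(type->inner) + " counted_by(" + type->text + ")";
  }
  return std::string();
}

} // namespace placeholder_sketch
```

Because the placeholder sits in the type itself, the rebuild needs no nesting index: wherever the placeholder ended up during type construction is where the concrete wrapper appears.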
Execution Plan
Properly cherry-picking and transitioning to this approach requires a large change downstream, so it should be done incrementally. The tasks below are refactoring steps that can be landed independently before actually switching to the new approach.
Each task will have its own GitHub issue so they can be worked on in parallel. I am using Claude Code agents to orchestrate tasks and help prepare preliminary patches for individual sub-tasks where possible, so contributors can easily pick them up and polish.
Once the refactoring is complete, features not yet available upstream (such as function parameter support and ended_by) can be incrementally upstreamed.
Terminology
LateParsedAttrType — AST type node; placeholder in the type system
LateParsedTypeAttribute — Data structure subtyping LateParsedAttribute (not an AST node)
LateParsedAttrInfo — Current DeclaratorChunk workaround (to be removed)
Tasks
Validation: All tasks must pass existing lit tests (llvm-lit --filter='counted-by|bounds-safety|sized-by' clang/test and llvm-lit clang/test/BoundsSafety). Each task should be self-contained and land independently.
Where possible, tasks are structured to cherry-pick parts of the upstream PR llvm#179612 independently, marked with (cherry-pick). I will split the upstream PR into smaller PRs/patches so each can be cherry-picked to the corresponding task. Tasks without this marker are downstream-only refactoring needed to prepare for the new approach.
Phase 0 — Independent Foundations (parallel)
T1: Move late parsing for parameters trigger from ParseDeclaratorInternal to ActOnFunctionDeclarator — [BoundsSafety] Move late parsing for parameters trigger from ParseDeclaratorInternal to ActOnFunctionDeclarator #12766
Currently, late parsing for parameters is triggered early during declarator parsing to prevent FunctionDecl from being immediately merged with a previous declaration when the bounds attributes differ. This should change because the new approach first creates LateParsedAttrType placeholders during declarator parsing and replaces them after the declaration is fully formed. Moving the trigger to after the full FunctionDecl is created validates that late parsing for parameters works from the new site. Function redeclaration must be verified.
T4: Extract diagnostics from ConstructDynamicBoundType / ConstructCountAttributedType / ConstructDynamicRangePointerType — [BoundsSafety] Extract diagnostics from ConstructDynamicBoundType / ConstructCountAttributedType / ConstructDynamicRangePointerType #12765
These CRTP TypeVisitor classes in SemaDeclAttr.cpp walk the type to a nested level index and construct the bounds-attributed type there, mixing type construction with diagnostics. Extract diagnostics into flat functions that run before construction; the conditions that currently trigger diagnostics become invariants enforced by assert(). Why: The new approach constructs types via TreeTransform replacement rather than these index-based visitors. However, the original construction is still needed for APINotes and template instantiation. Extracting diagnostics makes them callable independently from both the old and new code paths. Note: This migration is not strictly necessary for T10. If it risks blocking T10, an alternative is to create a separate diagnostic function for T10 and refactor afterwards.
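The shape of the T4 refactoring could look roughly like the sketch below: a flat diagnostic function that only inspects, and a constructor that only builds and asserts the former diagnostic conditions. All names (ToyType, diagnoseCountedBy, buildCountedType, tryAttach) and the diagnostic strings are hypothetical; they are not the functions or messages in SemaDeclAttr.cpp.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

namespace diag_sketch {

// Hypothetical summary of the properties a check would look at.
struct ToyType {
  bool isPointer;
  bool pointeeHasKnownSize;
};

// Hypothetical result of construction.
struct CountedType {
  ToyType base;
  std::string countExpr;
};

// Flat diagnostic function: inspects the type and reports problems without
// constructing anything, so any code path can call it.
inline std::vector<std::string> diagnoseCountedBy(const ToyType &type) {
  std::vector<std::string> diags;
  if (!type.isPointer)
    diags.push_back("counted_by requires a pointer type");
  else if (!type.pointeeHasKnownSize)
    diags.push_back("counted_by requires a pointee with a known size");
  return diags;
}

// Construction no longer diagnoses. The conditions that used to produce
// diagnostics are invariants here.
inline CountedType buildCountedType(const ToyType &type,
                                    std::string countExpr) {
  assert(type.isPointer && "diagnose before constructing");
  assert(type.pointeeHasKnownSize && "diagnose before constructing");
  return CountedType{type, std::move(countExpr)};
}

// A caller runs diagnostics first and builds only when they are clean.
inline bool tryAttach(const ToyType &type, const std::string &countExpr,
                      CountedType &out, std::vector<std::string> &diags) {
  diags = diagnoseCountedBy(type);
  if (!diags.empty())
    return false;
  out = buildCountedType(type, countExpr);
  return true;
}

} // namespace diag_sketch
```

With this split, an index-based visitor and a placeholder-replacing transform could both call the same diagnostic function before taking their own construction path.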
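T7: Move LateParsedAttribute outside Parser class; move LateParsedAttrList to DeclSpec.h (cherry-pick) — [BoundsSafety] Move LateParsedAttribute outside Parser class; move LateParsedAttrList to DeclSpec.h #12764
Cherry-pick the structural moves from the upstream PR [BoundsSafety] Support bounds-safety attributes in type positions llvm/llvm-project#179612:
Move LateParsedDeclaration and LateParsedAttribute outside the Parser class (stay in Parser.h)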
Move LateParsedAttrList from Parser.h to DeclSpec.h
Forward-declare LateParsedAttribute in DeclSpec.h (opaque)
Replace LateParsedAttrInfo usage in DeclaratorChunk with LateParsedAttribute*
Remove LateParsedAttrInfo
T6: Handle non-late-parsed counted_by/ended_by as type attributes in SemaType.cpp — [BoundsSafety] Handle non-late-parsed counted_by/ended_by as type attributes in SemaType.cpp #12767
Currently both late-parsed and non-late-parsed paths go through SemaDeclAttr.cpp (declaration attribute handling). Move the non-late-parsed path to SemaType.cpp (type attribute handling), while the late-parsed path remains in SemaDeclAttr.cpp until T10 lands. Whether this is feasible as a standalone change needs investigation; if it is not, this task will be absorbed into T10. Why: The new approach treats these as type attributes, not declaration attributes. Migrating the non-late-parsed path first reduces the scope of T10.
Phase 1 — Data Structures + Diagnostic Reconciliation (parallel chains)
T8: Introduce LateParsedTypeAttribute (upstream/cherry-pick) — Upstreamed as [BoundsSafety][NFC] Introduce LateParsedTypeAttribute for late-parsed type attributes llvm/llvm-project#192799. It subtypes LateParsedAttribute to represent late-parsed type attributes. Independent of T7. Why: The new approach needs a distinct data structure to carry type-attribute-specific information (e.g., the pointer nesting level where the placeholder was inserted) through the late parsing pipeline.
T9: Introduce LateParsedAttrType — [BoundsSafety][NFC] Introduce LateParsedAttrType AST placeholder type #13000 (upstream/cherry-pick) — This is a placeholder type in the AST that is replaced with a concrete type (e.g., CountAttributedType) during late parsing. Depends on T8 because LateParsedAttrType embeds LateParsedTypeAttribute.
T5: Merge redundant diagnostics — Merge handlePtrCountedByEndedByAttr (BoundsSafety-enabled pass in SemaDeclAttr.cpp) and handleCountedByAttrField (default pass), which have overlapping diagnostic logic, including the functions they call. Also reconcile with CountArgChecker/RangeArgChecker in SemaType.cpp. Depends on T4.
Phase 2 — Diagnostic API Design
T2: Split diagnostics into DeclContext vs Type
DeclContext diagnostics (needs Decl) — run right after late parsing (whole struct / function prototype)
Type diagnostics (nested type related) — called from two places:
When attaching LateParsedAttrType (or before) — invariants for LateParsedAttrType checked here
When replacing LateParsedAttrType with concrete CountAttributedType and friends — invariants for CountAttributedType checked here; argument expression diagnostics here too
Why: The new approach has two distinct points where diagnostics run: (1) when the placeholder is inserted during type construction, and (2) when the placeholder is replaced as part of late parsing for a declaration. Each site needs a different subset of diagnostics. This split defines the API that T10 will use. Depends on T5.
T11: Refactor merged handler API for multiple callers — Refactor the merged handler so the core logic can be called from both APINotes (where Level is provided explicitly as a parameter) and the new late parsing logic once it's introduced (where the placeholder type already knows its position). Depends on T2.
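One possible shape for the T11 API is a core routine that operates on the target node, plus two thin entry points that differ only in how they locate that node. The sketch below uses a toy chain of nodes; TypeNode, attachCount, attachCountAtLevel, and attachCountAtPlaceholder are hypothetical names, not the merged handler's real signature.

```cpp
#include <cassert>
#include <string>

namespace handler_sketch {

// Toy chain of nested type nodes, outermost first (hypothetical model).
struct TypeNode {
  std::string name;
  TypeNode *inner;        // next nested node, or nullptr at the innermost type
  std::string countExpr;  // empty until a count is attached
};

// Core logic shared by both callers: it works on the node itself and does not
// care how the node was located.
inline void attachCount(TypeNode &target, const std::string &countExpr) {
  assert(target.countExpr.empty() && "count attached twice");
  target.countExpr = countExpr;
}

// APINotes-style entry point: the position arrives as an explicit level.
// Returns false if the level does not exist in this type.
inline bool attachCountAtLevel(TypeNode &outermost, unsigned level,
                               const std::string &countExpr) {
  TypeNode *node = &outermost;
  for (unsigned i = 0; i < level; ++i) {
    if (!node->inner)
      return false;
    node = node->inner;
  }
  attachCount(*node, countExpr);
  return true;
}

// Late-parsing-style entry point: the placeholder already identifies the
// node, so no level is needed.
inline void attachCountAtPlaceholder(TypeNode &placeholderSite,
                                     const std::string &countExpr) {
  attachCount(placeholderSite, countExpr);
}

} // namespace handler_sketch
```

Keeping the level walk out of the core routine is the design choice being illustrated: only the APINotes-style caller pays for index handling.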
Phase 3 — Convergence
T10: New late parsing mechanism with LateParsedAttrType (cherry-pick + extend)
Cherry-pick the remaining core of the upstream PR [BoundsSafety] Support bounds-safety attributes in type positions llvm/llvm-project#179612: LateParsedAttrType placeholder creation during processLateTypeAttrs, ProcessLateParsedTypeAttributesForFields to replace placeholders via RebuildTypeWithLateParsedAttr, and LateParsedAttribute vectors in DeclaratorChunk/Declarator/DeclSpec. On top of that, introduce ProcessLateParsedTypeAttributesForParameters for late parsing of attributes on function parameter types, and handle ended_by in a similar manner. This can be split into smaller sub-tasks. Absorbs T6 if not standalone. Replaces T1's transitional function. Depends on T1, T4, T9, T2, and T11.
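As a sanity check on the plan, the "Depends on" statements above can be written down as a graph and ordered with Kahn's algorithm. The edges below are transcribed from the task descriptions; T1, T4, T6, T7, and T8 have no stated prerequisites. The helper names (taskDependencies, landingOrder, respectsDependencies) are hypothetical, and the order produced is only one of several valid ones.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

namespace plan_sketch {

// Maps each task to the tasks it depends on.
using Graph = std::map<std::string, std::vector<std::string>>;

inline Graph taskDependencies() {
  return {
      {"T1", {}},       {"T4", {}},     {"T6", {}},      {"T7", {}},
      {"T8", {}},       {"T9", {"T8"}}, {"T5", {"T4"}},  {"T2", {"T5"}},
      {"T11", {"T2"}},  {"T10", {"T1", "T4", "T9", "T2", "T11"}},
  };
}

// Kahn's algorithm: repeatedly land a task whose prerequisites have all
// landed. Returns an empty vector if there is a cycle or a prerequisite that
// is not itself a task in the graph.
inline std::vector<std::string> landingOrder(const Graph &deps) {
  std::map<std::string, std::size_t> remaining;
  std::map<std::string, std::vector<std::string>> dependents;
  for (const auto &entry : deps) {
    remaining[entry.first] = entry.second.size();
    for (const auto &pre : entry.second)
      dependents[pre].push_back(entry.first);
  }
  std::vector<std::string> ready;
  for (const auto &entry : remaining)
    if (entry.second == 0)
      ready.push_back(entry.first);
  std::vector<std::string> order;
  while (!ready.empty()) {
    std::string task = ready.back();
    ready.pop_back();
    order.push_back(task);
    for (const auto &next : dependents[task])
      if (--remaining[next] == 0)
        ready.push_back(next);
  }
  if (order.size() != deps.size())
    order.clear();
  return order;
}

// True if every task appears after all of its prerequisites.
inline bool respectsDependencies(const std::vector<std::string> &order,
                                 const Graph &deps) {
  std::map<std::string, std::size_t> position;
  for (std::size_t i = 0; i < order.size(); ++i)
    position[order[i]] = i;
  for (const auto &entry : deps) {
    auto task = position.find(entry.first);
    if (task == position.end())
      return false;
    for (const auto &pre : entry.second) {
      auto before = position.find(pre);
      if (before == position.end() || before->second >= task->second)
        return false;
    }
  }
  return true;
}

} // namespace plan_sketch
```

A non-empty result confirms that the stated dependencies contain no cycle, which is what allows the phases to be landed incrementally.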