iframes #778
🦋 Changeset detected
Latest commit: c17a209
The changes in this PR will be included in the next version bump.
PR Summary
This PR implements comprehensive iframe support in Stagehand, enabling interaction with nested iframes across extract, observe, and act operations. The changes introduce frame-scoped element IDs, combined accessibility trees, and CDP session management for both same-process and out-of-process iframes.
- Added frame-scoped element IDs using new `EncodedId` format (`${frameOrdinal}-${backendNodeId}`) for unique element identification across frames
- Implemented `getAccessibilityTreeWithFrames()` to build combined accessibility trees spanning main document and nested iframes
- Added `deepLocator` function in `actHandlerUtils.ts` to handle XPath selectors traversing through iframes
- Added three comprehensive eval tasks (`iframe_hn`, `iframe_same_proc`, `iframe_form_filling`) to test iframe functionality
- Fixed redundant error check in `iframe_hn.ts` where the same condition is checked twice
16 file(s) reviewed, 16 comment(s)
working through it, another note before approving is adding this behind |
820de04 to 622490f
ff0f811 to 7430930
CDP Goat status
This PR was opened by the [Changesets release](https://github.com/changesets/action) GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated.

# Releases

## @browserbasehq/[email protected]

### Patch Changes

- [#796](#796) [`12a99b3`](12a99b3) Thanks [@miguelg719](https://github.com/miguelg719)! - Added a experimental flag to enable the newest and most experimental features
- [#807](#807) [`2451797`](2451797) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - include version number in StagehandDefaultError message
- [#803](#803) [`1d631a5`](1d631a5) Thanks [@miguelg719](https://github.com/miguelg719)! - Enable session affinity for cache optimization
- [#804](#804) [`9c398bb`](9c398bb) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - update operatorResponseSchema based on new openai spec
- [#786](#786) [`c19ad7f`](c19ad7f) Thanks [@miguelg719](https://github.com/miguelg719)! - Handle reroute to account for rollout

## @browserbasehq/[email protected]

### Minor Changes

- [#778](#778) [`df570b6`](df570b6) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - iframe support

### Patch Changes

- [#809](#809) [`03ebebc`](03ebebc) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - log NoObjectGenerated error details
- [#801](#801) [`1d4f0ab`](1d4f0ab) Thanks [@miguelg719](https://github.com/miguelg719)! - Default use API to true
- [#798](#798) [`d86200b`](d86200b) Thanks [@miguelg719](https://github.com/miguelg719)! - Fix pino logging memory leak by reusing worker

## @browserbasehq/[email protected]

### Patch Changes

- Updated dependencies \[[`12a99b3`](12a99b3), [`2451797`](2451797), [`1d631a5`](1d631a5), [`9c398bb`](9c398bb), [`c19ad7f`](c19ad7f)]:
  - @browserbasehq/[email protected]

## @browserbasehq/[email protected]

### Patch Changes

- Updated dependencies \[[`12a99b3`](12a99b3), [`2451797`](2451797), [`1d631a5`](1d631a5), [`9c398bb`](9c398bb), [`c19ad7f`](c19ad7f)]:
  - @browserbasehq/[email protected]

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
why
What changed
This PR adds full support for interacting with nested `iframe`s in Stagehand, across `extract`, `observe`, and `act`, by:
- […] spanning the main document and any nested iframes.
- […] globally unique across frames. This is done because `backendNodeId`s are not guaranteed to be unique across OOPIFs (out-of-process iframes).
- […] iframes (OOPIFs) and fallback to same-process frames.
- […] `<iframe>` elements when performing Playwright actions.
- […] “frameId-backendId” string format for element IDs.
- […] or XPath lookups fail.
- […] `context.ts`, `stagehand.ts`, `stagehandErrors.ts`) and utility modules (`a11y/utils.ts`, `utils.ts`, handlers) to accommodate frame-aware operations.
- […] `iframe_hn` (extract), `iframe_same_proc` (act), `iframe_form_filling` (act).

Details
1. Frame‑Scoped Element IDs (EncodedId)
- […] `encodeWithFrameId(…)`, `ordinalForFrameId(…)`, and `resetFrameOrdinals()` to assign and track per-frame ordinals.
- […] `WeakMap<Page|Frame, CDPSession>` so we can open sessions against arbitrary frames.
- […] `EncodedId = ${number}-${number}` for “frameOrdinal-backendNodeId” IDs.
- […] `TreeResult` to key `xpathMap`/`idToUrl` by `EncodedId`.
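To make the encoding scheme concrete, here is a minimal sketch of how per-frame ordinals and `EncodedId` strings could fit together. The function names are taken from the description above, but their bodies, the ordinal `0` for the main frame, and the string `frameId` parameter type are assumptions for illustration, not the code in this PR.

```typescript
// Sketch only: bodies are assumed, not copied from the PR.
type EncodedId = `${number}-${number}`;

// Assumed storage: CDP frameId -> small ordinal, assigned on first sight.
const frameOrdinals = new Map<string, number>();

function ordinalForFrameId(frameId?: string): number {
  // Assumption: the main document has no frameId and gets ordinal 0.
  if (frameId === undefined) return 0;
  let ordinal = frameOrdinals.get(frameId);
  if (ordinal === undefined) {
    ordinal = frameOrdinals.size + 1;
    frameOrdinals.set(frameId, ordinal);
  }
  return ordinal;
}

function encodeWithFrameId(
  frameId: string | undefined,
  backendNodeId: number,
): EncodedId {
  return `${ordinalForFrameId(frameId)}-${backendNodeId}` as EncodedId;
}

function resetFrameOrdinals(): void {
  frameOrdinals.clear();
}
```

Under these assumptions, two frames that both contain a node with `backendNodeId` 42 produce distinct IDs, which is the collision the PR description is guarding against.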
2. Combined Accessibility Tree Across Frames
- […] `getAccessibilityTreeWithFrames()` which walks the CDP frame tree, captures accessibility sub-trees for each frame, and concatenates them into a single “combinedTree” string plus combined URL/XPath maps keyed by `EncodedId`.
- […] `formatSimplifiedTree()` to emit the new `encodedId` in tree lines.
- […] `buildBackendIdMaps()` to traverse nested frame DOM nodes (OOPIF and same-process iframes) and include the frame’s `frameId` when encoding backend IDs.
3. Deep XPath Locator for Frame Actions
- Added `deepLocator(root, rawXPath)` which splits an XPath on `<iframe>` steps to descend into `FrameLocator`s automatically before applying the remainder of the path.
- Uses `deepLocator()` instead of a flat `page.locator(...)` so that Playwright actions can target elements inside iframes when `options.iframes` is set.
4. Frame‑Aware Extract & Observe Handlers
- […] `getAccessibilityTreeWithFrames()` when `iframes: true`; otherwise falls back to the legacy single-frame tree.
- […] `iframes` flag into its internal calls.
- […] `getAccessibilityTreeWithFrames()` and builds its element-to-XPath mapping from the combined tree.
- […] combined tree approach.
5. Inference Schema & Element ID Changes
- […] `elementId` is now a string matching the regex `/^\d+-\d+$/` (frame-ID plus backend-ID) instead of a raw number.
6. Zod URL‑Field Transformation
- […] `makeIdNumberSchema()` → `makeIdStringSchema()` to emit `z.string().regex(/^\d+-\d+$/)` for fields that were formerly `string().url()`, so extracted URL placeholders match the new `EncodedId` format.
- […] `injectUrls()` to map both numeric and frame-ID strings back into real URLs once extraction is complete.
to do:
- […] `iframes: true`