Stagehand Throws “Unknown Instruction” When Using page.extract()—Is My Schema Incorrect? #682
I’m trying to use page.extract() in Stagehand to pull data from a PR page, using a schema with fields like author, title, and date. The instruction is clear, and the fields are defined with z.object() as string types, but when I run the script, Stagehand either throws an “Unknown instruction” error or returns null values. I’m not sure whether the problem is the schema, the structure of the page, or how the model interprets the instruction. How do I troubleshoot this and make sure extract() works properly?
Replies: 1 comment
This error usually stems from either a schema formatting issue or the model failing to match your instruction to visible DOM elements. First, make sure your schema is well-formed. Stagehand relies on Zod for parsing, so even small mistakes (such as a missing .describe() or a wrong type annotation) can cause extraction to fail silently. Second, confirm that the target data is actually visible, and not hidden inside an iframe or rendered only after user interaction. If the page is dynamic, add a waitForSelector or a short delay before calling extract(). You can also simplify the instruction and test one field at a time, which helps isolate the part that is failing. For stubborn pages, calling page.act() first to reveal the content can improve extract() accuracy.
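To make the "one field at a time" suggestion concrete, here is a minimal sketch of a probing helper. It does not import Stagehand or Zod; instead it assumes the page object exposes `extract({ instruction, schema })`, as used in the question. The names `ExtractCapable`, `FieldProbe`, `ProbeResult`, and `probeFields` are hypothetical and exist only for this illustration.

```typescript
// Minimal structural type for the single method this sketch relies on.
// Assumption: the Stagehand page accepts an { instruction, schema } object.
interface ExtractCapable {
  extract(options: { instruction: string; schema: unknown }): Promise<unknown>;
}

// One probe per field: a narrow instruction plus a single-field schema.
interface FieldProbe {
  field: string;
  instruction: string;
  schema: unknown; // e.g. a Zod object with exactly one described field
}

interface ProbeResult {
  field: string;
  ok: boolean;
  value?: unknown;
  error?: string;
}

// Runs each probe separately so that a thrown error or a null value
// can be traced back to one specific field.
async function probeFields(
  page: ExtractCapable,
  probes: FieldProbe[],
): Promise<ProbeResult[]> {
  const results: ProbeResult[] = [];
  for (const probe of probes) {
    try {
      const data = (await page.extract({
        instruction: probe.instruction,
        schema: probe.schema,
      })) as Record<string, unknown> | null;
      const value = data?.[probe.field];
      // Treat null, undefined, and empty strings as "nothing extracted".
      const ok = value !== null && value !== undefined && value !== "";
      results.push({ field: probe.field, ok, value });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ field: probe.field, ok: false, error: message });
    }
  }
  return results;
}

// Hypothetical usage with Stagehand and Zod (not executed here):
//   const results = await probeFields(page, [
//     {
//       field: "author",
//       instruction: "Extract the username of the PR author",
//       schema: z.object({ author: z.string().describe("PR author username") }),
//     },
//   ]);
//   console.table(results);
```

A field that comes back with `ok: false` and an `error` points at the instruction or schema for that field, while `ok: false` with a null `value` suggests the data was not visible on the page when extract() ran.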