Agent stack task report
Can Playwright Reference find an OpenAPI spec?
Yes, with handoffs: Playwright Reference scored 78% on the "find an OpenAPI spec" task across 12 verified bench tasks.
More evidence needed | Verified launch fixture | Last verified Jun 11, 2026 | Based on 12 verified bench tasks
Search indexing waits for at least 20 evaluable bench tasks in this task family.
Evidence basis
What the verified traces show
Finished: 67% (8 of 12 tasks)
Human handoff: 17% (2 of 12 tasks)
Partial: 8% (1 of 12 tasks)
Blocked: 8% (1 of 12 tasks)
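The outcome percentages above follow directly from the task counts. The sketch below recomputes them, assuming each share is the count divided by the 12-task total and rounded to the nearest whole percent. The rounding rule and the names `outcome_counts` and `outcome_shares` are illustrative assumptions, not part of CrawlDex. It does not reproduce the 78% score, because this report does not state the scoring formula.

```python
# Outcome counts copied from the report; variable and function names are
# hypothetical and used only for illustration.
outcome_counts = {
    "Finished": 8,
    "Human handoff": 2,
    "Partial": 1,
    "Blocked": 1,
}


def outcome_shares(counts):
    """Return each outcome's share of all tasks as a whole-number percentage.

    Assumes rounding to the nearest whole percent, which matches the
    figures shown in the report.
    """
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


shares = outcome_shares(outcome_counts)
total_tasks = sum(outcome_counts.values())
for name, pct in shares.items():
    print(f"{name}: {pct}% ({outcome_counts[name]} of {total_tasks} tasks)")
```

Because each share is rounded independently, rounded percentages do not always sum to exactly 100, although they do for these counts.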
Playwright Reference full bench result
Per-category scores, task-family coverage, and trace summary for the stack.
Find an OpenAPI spec task hub
Site-level evidence for this task across the public CrawlDex corpus.
Methodology
How CrawlDex scores outcomes, confidence, freshness, and reputation-gated evidence.