Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4’s 33.5%
Most web agents today drive a browser one action at a time. The model receives the current page state, as a screenshot or DOM text, and predicts the next click, keypress, or scroll. This action-at-a-time design made sense when language models had limited reasoning ability. As models have become more capable at writing and debugging code, that rigid loop has become more of a constraint than a help.

Microsoft Research’s AI Frontiers lab took a different approach. Their new open-source framework, Webwright, gives the agent a terminal instead of a stateful browser session. The agent writes Playwright code to control browsers, runs bash commands, inspects logs, and iteratively refines scripts. Playwright is an open-source browser automation library, also from Microsoft, that supports programmatic control of Chromium, Firefox, and WebKit browsers.

What Webwright Does Differently

Webwright separates the agent from the browser and treats the browser as som...
