Posts

Microsoft Releases Fara1.5: A Family of Browser Computer-Use Agents (4B/9B/27B) That Outperform OpenAI Operator and Gemini 2.5 Computer Use on Online-Mind2Web

Microsoft Research’s AI Frontiers lab released Fara1.5, a family of computer-use agent (CUA) models for the browser. The release ships three sizes: Fara1.5-4B, Fara1.5-9B, and Fara1.5-27B. The models are integrated with MagenticLite, Microsoft’s sandboxed browser interface for these agents. Computer-use agents are pixel-to-action models that drive a real browser: they read screenshots and emit mouse and keyboard actions to complete tasks. Recent agent products like OpenAI’s Operator and Google’s Gemini 2.5 Computer Use sit in this category. Fara1.5-27B scores 72% task success on Online-Mind2Web, a benchmark that covers 300 tasks across 136 popular sites. On the same evaluation, OpenAI’s Operator scores 58.3% and Gemini 2.5 Computer Use scores 57.3%. Yutori’s Navigator n1 reaches 64.7%, and Fara1.5-9B scores 63.4%. The 9B result nearly doubles that of the predecessor Fara-7B, which scored 34.1% on the same benchmark. https://ift.tt/d9X0sCe Architec...
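As a quick sanity check on the figures quoted in this excerpt, the comparison can be worked through in a few lines of Python. This is only a sketch: the scores are copied from the text above, and the variable names are illustrative rather than taken from any Fara1.5 tooling.

```python
# Online-Mind2Web task success rates (%) as quoted in the excerpt above.
scores = {
    "Fara1.5-27B": 72.0,
    "Navigator n1": 64.7,
    "Fara1.5-9B": 63.4,
    "OpenAI Operator": 58.3,
    "Gemini 2.5 Computer Use": 57.3,
    "Fara-7B": 34.1,
}

# How much the 9B model improves on its predecessor, as a ratio.
ratio_9b = scores["Fara1.5-9B"] / scores["Fara-7B"]

# Gap between the largest Fara1.5 model and Operator, in percentage points.
gap_vs_operator = round(scores["Fara1.5-27B"] - scores["OpenAI Operator"], 1)

# Print the models from highest to lowest score.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<26}{score:5.1f}%")

print(f"Fara1.5-9B vs Fara-7B: {ratio_9b:.2f}x")
print(f"Fara1.5-27B lead over Operator: {gap_vs_operator} points")
```

The ratio comes out at roughly 1.86x, which is what "nearly doubles" refers to, and the 27B model leads Operator by 13.7 percentage points.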

Build Recurrent-Depth Transformers with OpenMythos for MLA, GQA, Sparse MoE, and Loop-Scaled Reasoning

In this tutorial, we explore OpenMythos by building an advanced recurrent-depth transformer workflow that runs end-to-end in Google Colab. We create both MLA and GQA model variants, compare their parameter counts, and check the stability of the recurrent injection matrix through its spectral radius. We then move from simple forward and generation tests into a synthetic compositional reasoning task, where the model learns to predict the sum of digit chains modulo a fixed value. Through this setup, we study how recurrent loops enable a single model to reuse its parameters for deeper computation.

    import subprocess, sys

    def pip(*args):
        subprocess.run([sys.executable, "-m", "pip", "install", "-q", *args], check=False)

    try:
        import open_mythos  # noqa: F401
    except Exception:
        pip("open-mythos")
    try:
        import open_mythos  # noqa: F401
    except Exception:
        pip("git+https...
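The two ideas named in this excerpt, the digit-chain task and the spectral-radius stability check, can be sketched independently of OpenMythos with NumPy alone. Everything below is an assumption made for illustration: the helper names, the chain length, the modulus of 7, the stand-in matrix `A`, and the update rule `h = A @ h + e` are hypothetical and are not taken from the OpenMythos API or from the tutorial's own code.

```python
import numpy as np


def make_digit_chain_batch(rng, batch_size, chain_len, modulus):
    """Sample random digit chains and label each with its sum modulo `modulus`."""
    digits = rng.integers(0, 10, size=(batch_size, chain_len))
    labels = digits.sum(axis=1) % modulus
    return digits, labels


def spectral_radius(matrix):
    """Return the largest absolute eigenvalue of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(matrix))))


rng = np.random.default_rng(0)
digits, labels = make_digit_chain_batch(rng, batch_size=4, chain_len=6, modulus=7)

# Hypothetical stand-in for a recurrent injection matrix (not a real model weight).
A = 0.5 * np.eye(8)
rho = spectral_radius(A)

# Assumed linear recurrence: with rho < 1 the repeated update stays bounded
# and converges instead of blowing up as the loop count grows.
h = np.zeros(8)
e = np.ones(8)
for _ in range(50):
    h = A @ h + e

print("labels:", labels)
print("spectral radius:", rho)
print("state norm after 50 loops:", np.linalg.norm(h))
```

A spectral radius below 1 is the usual condition for a linear recurrence of this form to remain stable under repeated application, which is why it is a natural quantity to inspect before scaling the number of loops.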

How CopilotKit Is Redefining the Agentic AI Stack in 2026

For years, AI inside software meant a chat widget bolted onto the corner of an application. You typed, the model responded with text, and you manually translated that output into whatever you actually needed it to do. It was useful the way a calculator is useful: functional, but fundamentally passive. CopilotKit, a Seattle-based startup co-founded by Atai Barkai and Uli Barkai, has spent the last two years arguing that the model is broken, and in 2026 the developer community is agreeing loudly. The company’s approach is straightforward: the way forward is to enable agents to live inside applications, understand what users are doing, take actions, and show useful interfaces instead of just returning long blocks of text. That approach has produced a sharp 2026 shipping cycle covering three distinct infrastructure gaps: knowledge retrieval, testing reliability, and runtime persistence, with each release targeting the unglamorous, often-skipped...

Qwen Introduces Qwen3.7-Max: A Reasoning Agent Model With a 1M-Token Context Window

Most AI models today are not designed for sustained, multi-step autonomous execution. Tasks like running hundreds of iterative code modifications, or chaining tool calls across hours without human intervention, require a different kind of model architecture and training focus. Alibaba’s Qwen team formally announced Qwen3.7-Max at the 2026 Alibaba Cloud Summit on May 20. Two preview versions of the Qwen3.7 series, however, quietly appeared on Arena AI’s leaderboard with no press release and no official API announcement.

Two Preview Models Released Simultaneously

Alibaba previewed two models simultaneously: Qwen3.7-Max-Preview and Qwen3.7-Plus-Preview. They ranked 13th globally in text capabilities and 16th in vision capabilities, respectively, according to LM Arena. In Text Arena, Qwen3.7-Max-Preview ranked #13 overall, placing Alibaba as the #6 lab in text. In Vision Arena, Qwen3.7-Plus-Preview ranked #16 overall, placing Alibaba as the #5 lab in vision. The...