Posts

Best Authentication Platforms for AI Agents and MCP Servers in 2026

The Model Context Protocol has moved from Anthropic’s internal experiment to a de facto industry standard at a speed few integration protocols have matched. Since its launch in November 2024, adoption has been rapid: OpenAI adopted it in March 2025, Microsoft announced support in Copilot Studio the same month, and by late 2025 combined Python and TypeScript SDK downloads had crossed 97 million per month. In December 2025, Anthropic donated MCP to the Agentic AI Foundation under the Linux Foundation. Gartner projects that up to 40% of enterprise applications will include integrated task-specific AI agents by the end of 2026, up from less than 5% today. That growth has made authentication the central unsolved problem of the agentic stack. When AI agents do nothing but answer questions, auth is a conversation-level concern. When they read emails, update CRMs, write to databases, and call external APIs autonomously, auth becomes infrastructure, and the blast radius of getting it...

WorkOS Releases auth.md: An Open Agent Registration Protocol Built on OAuth Standards

For years, authentication on the web followed one design assumption: a human sits behind a browser. Click a button. Fill out a form. Verify an email. Copy an API key and paste it somewhere else. That model does not work when the user is delegating work to an agent. Agents are already writing code, opening pull requests, triaging tickets, querying systems, and updating records, but most products still have no real way for an agent to register. The workaround, giving an agent a raw API key or session token, produces credentials that are unscoped, hard to audit per session, and impossible to revoke selectively. WorkOS is proposing a structured alternative: auth.md, an open protocol for agent registration.
What is auth.md?
auth.md is a small Markdown file an application publishes at a well-known location, typically https://service.com/auth.md. The file tells agents how to register with that service: which flows are supported, which scopes exist, and how credentials are i...

Build a Complete Langfuse Observability and Evaluation Pipeline for Tracing, Prompt Management, Scoring, and Experiments

In this tutorial, we implement a complete pipeline for tracing, prompt management, scoring, datasets, and experiments with Langfuse, an open-source LLM engineering platform. The workflow runs with either a real OpenAI key or a deterministic mock LLM, so we can explore every major Langfuse feature without depending on paid model access. We start by setting up credentials and connecting to Langfuse. We then trace simple function calls, instrument a small RAG pipeline, manage prompts centrally, attach evaluation scores, and run dataset-based experiments. Along the way, we see how Langfuse helps us observe, evaluate, and improve LLM applications in a structured, production-ready way.
    import subprocess, sys
    def pip_install(*pkgs):
        subprocess.run([sys.executable, "-m", "pip", "install", "-qU", *pkgs], check=True)
    pip_install("langfuse", "openai")
    import os
    from getpass import getpa...

StepFun Releases StepAudio 2.5 Realtime: An End-to-End Voice Model with Roleplay-Specific RLHF and Paralinguistic Comprehension

StepFun, the Shanghai-based AI lab, has released StepAudio 2.5 Realtime, an end-to-end real-time speech large language model with fully customizable persona capabilities. Unlike pipeline-based systems that separate speech recognition, reasoning, and synthesis into sequential steps, it is a single unified system: audio goes in and audio comes out. The model supports Chinese and English. It connects via a WebSocket API at wss://api.stepfun.com/v1/realtime, using the model string step-2.5-realtime.
The Three Technical Pillars
The StepFun research team describes three core architectural innovations behind the model:
1. Million-Scale Persona Data Augmentation
Starting from 10,000+ high-quality natively authored personas, StepFun applied algorithmic augmentation to build a million-scale persona feature matrix. This was combined with millions of real-world co...

Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4’s 33.5%

Most web agents today drive a browser one action at a time. The model receives the current page state (as a screenshot or DOM text) and predicts the next click, keypress, or scroll. This action-at-a-time design made sense when language models had limited reasoning ability. As models have become more capable at writing and debugging code, that rigid loop has become a constraint rather than a help. Microsoft Research’s AI Frontiers lab took a different approach. Their new open-source framework, Webwright, gives the agent a terminal instead of a stateful browser session. The agent writes Playwright code to control browsers, runs bash commands, inspects logs, and iteratively refines scripts. Playwright is an open-source browser automation library, also from Microsoft, that supports programmatic control of Chromium, Firefox, and WebKit browsers.
What Webwright Does Differently
Webwright separates the agent from the browser and treats the browser as som...