Posts

Genesis AI Releases Nyx, Quadrants, and Genesis World 1.0 Physics Platform for Scalable Robotics Foundation Model Evaluation

Image
Genesis AI released Genesis World 1.0 . The platform consists of four components: the Genesis World physics engine, Nyx (a real-time path-traced renderer), Quadrants (a Python-to-GPU compiler), and a simulation interface. It is designed to accelerate robotics foundation model development through simulation-based evaluation. Robotics model development has two bottlenecks: data and iteration speed. The field has focused heavily on data. Genesis AI argues the slower, less-discussed bottleneck is the model development cycle itself — specifically, how fast teams can evaluate candidate policies and compare model checkpoints. What Problem Does This Solve? A typical policy evaluation at Genesis spans hundreds of tasks with hundreds of episodes each. Running that in the real world requires more than 200 hours of continuous robot operation with one operator and one robot station — for a single evaluation pass. Statistically meaningful comparisons across checkpoints require many such...

Hermes Agent Ships Tool Search for MCP: Anthropic Evals Show 49% to 74% Accuracy Gain on Opus 4

Image
Nous Research’s open-source Hermes Agent now ships a Tool Search feature. It directly addresses a growing bottleneck in AI agent systems: too many MCP tools filling up the context window. In this explainer article, we will breaks down what Tool Search does, how it works, and when to use it. The Problem: MCP Tools Are Eating Your Context Window When you connect multiple MCP (Model Context Protocol) servers to an AI agent, every tool’s JSON schema gets sent to the model on every turn. This happens even if the model only needs one or two tools for a given task. Real-world deployments feel this immediately. A Hermes deployment with five MCP servers and 34 tools shows average prompt sizes of 45,000 tokens per turn. Roughly 22,000 of those tokens — around 50% — are tool schema overhead alone. Anthropic’s own engineering data shows tool definitions can consume 134,000 tokens before optimization. Tool Attention measures the “MCP Tools Tax” at 15...

How to Use AgentTrove: Streaming 1.7M Agentic Traces and Building a Clean ShareGPT SFT Dataset in Python

Image
In this tutorial, we explore AgentTrove , one of the largest open-source collections of agentic interaction traces, and learn how to work with it efficiently. Instead of downloading the full dataset, we use streaming to inspect rows, detect the conversation schema, normalize agent turns, and understand how user, assistant, system, and tool messages are structured. We also build utilities to parse command-style assistant outputs, render complete trajectories in a readable format, and study how agents interact with tools across different tasks. Also, we create a lightweight analytical workflow that samples thousands of traces, converts them into a DataFrame, summarizes turn-level statistics, visualizes important dataset patterns, and exports successful traces into a clean ShareGPT-style JSONL format for supervised fine-tuning. Copy Code Copied Use a different Browser !pip -q install "datasets>=2.19" pandas matplotlib pyarrow huggingface_hub import itertools, json, coll...

NVIDIA Introduces X-Token: Projection-Guided Cross-Tokenizer KD That Outperforms GOLD by +3.82 Average Points on Llama-3.2-1B

Image
Knowledge distillation (KD) transfers “dark knowledge” from a large teacher model to a smaller student. The student learns from the teacher’s full output probability distribution over tokens, not just correct answers. This is done via per-position Kullback–Leibler (KL) divergence over next-token probability distributions. This formulation requires a shared tokenizer. A practitioner committed to Llama-3.2-1B cannot leverage stronger teachers with incompatible tokenizers — such as Phi-4-mini or Qwen3-4B — because token positions do not correspond across vocabularies. This also prevents multi-teacher distillation across tokenizer families. NVIDIA researchers introduced X-Token , a logit-distribution-based method for cross-tokenizer KD (Knowledge distillation). It operates as a drop-in replacement for the standard KD loss, requiring no auxiliary trainable components and no architectural changes. The Problem X-Token is Solving Two prior approaches dominate ...