Posts

NVIDIA HORIZON: A Hands-Free Agent that Evolves Git Worktrees and Hits 100% RTL Benchmark Completion

Image
NVIDIA Research introduced HORIZON , a hands-free agent framework for hardware design. It treats hardware design as repository-level code evolution. This research team exercises the register-transfer level (RTL) instantiation. A structured Markdown harness becomes a project pack. A self-contained agent loop then evolves an isolated git worktree. It commits a version only when an executable acceptance gate passes. The research team reports 100% completion across every evaluated RTL benchmark suite. It also states plainly that agentic hardware design is not solved. What is HORIZON? Single-turn code generation has a clear limit on executable design tasks. Plausible Verilog is not enough for real hardware. Correctness depends on cycle-level behavior, reset conventions, bit widths, and simulator feedback. HORIZON hosts each design problem as a version-controlled repository, not a one-shot prompt. The only required input is a structured Markdown harness. That harness carries...

NVIDIA AI Introduces ASPIRE: A Self-Improving Robotics Framework Reaching 31% Zero-Shot on LIBERO-Pro Long Tasks

Traditional robot programming is hard to scale. It requires orchestrating multimodal perception, physical contact dynamics, diverse configurations, and execution failures by hand. Code-as-policy systems let language models compose these into executable robot programs. That makes robot behavior inspectable, editable, and debuggable. But existing robotic coding agents run in naive execution environments. They receive only coarse, task-level feedback. A failed rollout signals that the task failed, not why. The root cause can be perception, motion planning, grasping, contact dynamics, or long-horizon coordination. These systems also discard fixes once a task ends. So the agent solving its hundredth task is no more experienced than at its first. A team of researchers from NVIDIA, University of Michigan, UIUC, UC Berkeley, and CMU introduces ASPIRE (Agentic Skill Programming through Iterative Robot Exploration) . It is a continual learning system that writes and refines robot contro...

Mistral AI Releases Leanstral 1.5: An Apache-2.0 Lean 4 Code Agent Model Solving 587 of 672 PutnamBench Problems

Today, Mistral AI released Leanstral 1.5 . It is a code agent model built for Lean 4. The release targets automated theorem proving and proof engineering. Weights are open under Apache 2.0. A free API endpoint, leanstral-1-5 , is now live. Leanstral 1.5 updates the earlier Leanstral-2603 model. It belongs to the Mistral Small 4 family. What is Leanstral 1.5 Leanstral 1.5 is a code agent model for Lean 4 , a proof assistant. A proof assistant checks every logical step mechanically. Lean 4 can express objects like perfectoid spaces and properties of Rust fragments. The architecture is a mixture-of-experts, or MoE. An MoE routes each token to a few specialized sub-networks. This keeps compute low while total capacity stays large. Leanstral uses 128 experts, with 4 active per token. Total size is 119B parameters, with 6.5B activated per token. Context length is 256k tokens. Input is multimodal, accepting text and image. Output is text only. How Mistral Trained Lean...