Posts

NVIDIA AI Introduces ASPIRE: A Self-Improving Robotics Framework Reaching 31% Zero-Shot on LIBERO-Pro Long Tasks

Traditional robot programming is hard to scale. It requires orchestrating multimodal perception, physical contact dynamics, diverse configurations, and execution failures by hand. Code-as-policy systems let language models compose these into executable robot programs. That makes robot behavior inspectable, editable, and debuggable. But existing robotic coding agents run in naive execution environments. They receive only coarse, task-level feedback. A failed rollout signals that the task failed, not why. The root cause can be perception, motion planning, grasping, contact dynamics, or long-horizon coordination. These systems also discard fixes once a task ends. So the agent solving its hundredth task is no more experienced than at its first. A team of researchers from NVIDIA, University of Michigan, UIUC, UC Berkeley, and CMU introduces ASPIRE (Agentic Skill Programming through Iterative Robot Exploration). It is a continual learning system that writes and refines robot contro...

Mistral AI Releases Leanstral 1.5: An Apache-2.0 Lean 4 Code Agent Model Solving 587 of 672 PutnamBench Problems

Today, Mistral AI released Leanstral 1.5. It is a code agent model built for Lean 4. The release targets automated theorem proving and proof engineering. Weights are open under Apache 2.0. A free API endpoint, leanstral-1-5, is now live. Leanstral 1.5 updates the earlier Leanstral-2603 model. It belongs to the Mistral Small 4 family.

What is Leanstral 1.5

Leanstral 1.5 is a code agent model for Lean 4, a proof assistant. A proof assistant checks every logical step mechanically. Lean 4 can express objects like perfectoid spaces and properties of Rust fragments. The architecture is a mixture-of-experts, or MoE. An MoE routes each token to a few specialized sub-networks. This keeps compute low while total capacity stays large. Leanstral uses 128 experts, with 4 active per token. Total size is 119B parameters, with 6.5B activated per token. Context length is 256k tokens. Input is multimodal, accepting text and image. Output is text only.

How Mistral Trained Lean...
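
The expert routing described in the excerpt above can be illustrated with a small top-k sketch. Only the counts (128 experts, 4 active per token) come from the post. The function names, the random router scores, and the choice to softmax over the selected experts are assumptions made for illustration; this is not Leanstral's actual router.

```python
import math
import random

NUM_EXPERTS = 128  # expert count stated in the post
TOP_K = 4          # experts active per token, stated in the post


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, k=TOP_K):
    """Pick the k highest-scoring experts and return (index, weight) pairs.

    Assumption: weights are a softmax over the selected logits only,
    which is one common variant of top-k gating.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


# Hypothetical router scores for a single token.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
assignment = route_token(logits)

for expert, weight in assignment:
    print(f"expert {expert:3d} weight {weight:.3f}")
```

Only the selected experts would run for this token, which is how an MoE keeps per-token compute low while total capacity stays large.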

Designing a Schema-Guided Invoice Intelligence Pipeline with lift-pdf for Accounts-Payable Extraction, Validation, and Ledger Generation

In this tutorial, we build an end-to-end accounts-payable extraction pipeline with lift-pdf, using synthetic invoice PDFs as controlled test documents and a structured JSON schema as the target output format. Instead of treating invoice parsing as a simple OCR task, we frame it as schema-guided document understanding: we generate realistic invoices, define fields such as vendor identity, billing party, PO number, line items, tax, total amount, balance due, and payment status, and then ask the model to extract those values directly from the rendered PDF layout. We also include practical extraction traps that appear in real finance workflows, such as distinguishing bill-to from ship-to, separating subtotal from after-tax total, returning null for absent values, and correctly marking partially paid invoices as unpaid when a balance remains. Through GPU-aware model loading, optional 4-bit quantization, PDF generation and extraction, scoring, and ledger construction, we turn this tutor...
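
The extraction traps listed in the excerpt above can be expressed as checks on an already extracted record. The sketch below is plain Python and does not call lift-pdf. The field names, the `validate` helper, and the one-cent tolerance are hypothetical choices for illustration, not the tutorial's schema.

```python
from typing import List, Optional, TypedDict


class Invoice(TypedDict):
    # Hypothetical field names; the tutorial's real schema may differ.
    vendor_name: str
    bill_to: str
    po_number: Optional[str]  # None when the invoice has no PO number
    subtotal: float
    tax: float
    total_amount: float       # after-tax total
    balance_due: float
    payment_status: str       # "paid" or "unpaid"


def expected_status(balance_due: float) -> str:
    """A remaining balance means unpaid, even after a partial payment."""
    return "unpaid" if balance_due > 0 else "paid"


def validate(inv: Invoice) -> List[str]:
    """Return a list of problems found in one extracted invoice."""
    problems = []
    if abs(inv["subtotal"] + inv["tax"] - inv["total_amount"]) > 0.01:
        problems.append("total_amount does not equal subtotal + tax")
    if inv["payment_status"] != expected_status(inv["balance_due"]):
        problems.append("payment_status disagrees with balance_due")
    if inv["po_number"] == "":
        problems.append("absent po_number should be None, not an empty string")
    return problems


# A partially paid invoice that an extractor wrongly marked as paid.
sample: Invoice = {
    "vendor_name": "Acme Supplies",
    "bill_to": "Example Corp",
    "po_number": None,
    "subtotal": 1000.0,
    "tax": 80.0,
    "total_amount": 1080.0,
    "balance_due": 580.0,
    "payment_status": "paid",
}
problems = validate(sample)
print(problems)
```

Checks like these could run before ledger rows are built, so that a record with a mismatched total or status is flagged instead of silently entering the ledger.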