Posts

LlamaIndex ‘legal-kb’: Agentic Retrieval over Index v2 with retrieve, find, read, and grep Tools

Image
LlamaIndex has published legal-kb , a public reference application on GitHub. It is described as a knowledge base for legal documents, powered by LlamaIndex Index v2 (the LlamaParse Platform). The project demonstrates a pattern the team calls a Retrieval Harness for agentic retrieval. The approach differs from single-shot retrieval. Instead of one embedding search per query, an agent is given filesystem-style tools. It can then crawl a large, evolving knowledge base to solve a task. The tools mirror operations engineers already know: semantic and keyword search, regex grep, file search, and read. What is legal-kb? legal-kb is a working TanStack Start web app, not a library. You sign in, create a project, upload files, and chat with an agent. Each project is mirrored as a managed LlamaCloud Index v2. Uploaded files are parsed and indexed automatically in the background. The chat agent then queries that index live during each turn. The Retrieval Harness, in plain terms ...

Structured PDF-to-JSON: A Guide to Open-Source Extraction Models in 2026

Most enterprise data still sits inside PDFs, scans, and slide decks. Large language models and agents cannot use that data until it becomes structured JSON. Open-source document extraction has become the standard way to do that conversion on your own hardware. Two different problems hide under the phrase ‘PDF to JSON.’ The first is schema-driven extraction : you define fields, and a model fills them with values. The second is document parsing : a model reconstructs the page into structured JSON or Markdown. Most teams need one, sometimes both. Choosing the wrong category costs real time. Open weights matter here for cost and privacy. Proprietary APIs can cost thousands of dollars per million pages, and they require sending documents off-premise. Local models remove both constraints. Below are the models and toolkits worth evaluating, grouped by what they actually do. Two categories, one phrase Schema-driven extraction takes a document and a JSON schema, then...

Qwen’s Former Lead on What Hybrid Thinking Got Wrong — and Why He Now Backs Agents

Image
Junyang Lin was the technical lead of Alibaba’s Qwen project. He announced he was stepping down on March 3, 2026 . He now lists himself as an independent researcher on his personal site. In a talk titled ‘ Qwen: Towards a Generalist Model / Agent, ‘ he walks through the Qwen family. It ends on a single line: “Training models -> training agents.” He later expanded that line into an detailed post as an independent researcher. This article reads the talk and the detailed post together. What Lin’s Talk Actually Covers The talk is a tour of the Qwen model family, not a single release. It moves through QwQ-32B, Qwen2.5-Max, Qwen3, Qwen2.5-VL, and Qwen2.5-Omni. Each stop shows benchmark charts against contemporaries. The named baselines include DeepSeek-R1, Grok 3 Beta, Gemini 2.5 Pro, and OpenAI’s o-series. The Qwen3 stop carries the most detail. Lin highlights hybrid thinking modes: a thinking mode for step-by-step reasoning, and a ...

NVIDIA HORIZON: A Hands-Free Agent that Evolves Git Worktrees and Hits 100% RTL Benchmark Completion

Image
NVIDIA Research introduced HORIZON , a hands-free agent framework for hardware design. It treats hardware design as repository-level code evolution. This research team exercises the register-transfer level (RTL) instantiation. A structured Markdown harness becomes a project pack. A self-contained agent loop then evolves an isolated git worktree. It commits a version only when an executable acceptance gate passes. The research team reports 100% completion across every evaluated RTL benchmark suite. It also states plainly that agentic hardware design is not solved. What is HORIZON? Single-turn code generation has a clear limit on executable design tasks. Plausible Verilog is not enough for real hardware. Correctness depends on cycle-level behavior, reset conventions, bit widths, and simulator feedback. HORIZON hosts each design problem as a version-controlled repository, not a one-shot prompt. The only required input is a structured Markdown harness. That harness carries...