Posts

Moonshot AI Launches Kimi Work, a Local Desktop Agent Reportedly Running on Kimi K2.6 With a 300-Sub-Agent Agent Swarm

Moonshot AI has introduced Kimi Work, an AI agent that runs on your own desktop. The Beijing-based company announced it this week along with downloads for macOS and Windows. Kimi Work reads local files, drives your real browser, and runs scheduled tasks. It targets knowledge workers whose bottleneck is access to files and live sessions. Most agent tools of the past two years ran in the cloud: you type a goal, a remote server spins up a sandbox, and a hosted browser acts. Kimi Work runs locally instead, reaching files and sessions you already use.

What is Kimi Work?

Kimi Work is a downloadable application, not a web chat. You give it goals in plain language, and it acts on your machine. Independent community mentions report that it runs on Kimi K2.6, Moonshot’s flagship model. K2.6 is an open-weight Mixture-of-Experts model released on April 20, 2026. It activates about 32 billion parameters per token. It carries a 256K-token context window for long, multi-step ...

Zyphra Releases Zamba2-VL: Hybrid Mamba2–Transformer Vision-Language Models That Cut Time-to-First-Token by About an Order of Magnitude

Zyphra has released Zamba2-VL, a family of open vision-language models. The release covers three sizes: 1.2B, 2.7B, and 7B parameters. Each model is built on the Zamba2 hybrid SSM–Transformer backbone. Vision-language models (VLMs) read images and text together. They answer questions about charts, documents, and photos. Most open VLMs use a dense Transformer as the language model. Zamba2-VL replaces that with a hybrid state-space design. The goal is competitive accuracy at lower latency.

What is Zamba2-VL?

Zamba2-VL follows the now-standard LLaVA-style VLM template. A pre-trained vision encoder turns image patches into features. A lightweight MLP adapter projects those features into the language model’s space. The language model then reads an interleaved sequence of vision and text tokens. The models support single and multi-image understanding and grounding. Zyphra pairs each Zamba2 backbone with the Vision Transformer from Qwen2.5-VL. That encoder was chosen for ...
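The LLaVA-style template described in this excerpt can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not Zyphra's implementation: the dimensions, the random weights, the two-layer shape of the adapter, and the names `mlp_adapter`, `vision_feats`, and `text_embeds` are all hypothetical stand-ins for a real vision encoder and language model.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of the GELU activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def mlp_adapter(vision_feats, w1, b1, w2, b2):
    """Project vision-encoder features into the language model's embedding space."""
    return gelu(vision_feats @ w1 + b1) @ w2 + b2


rng = np.random.default_rng(0)
d_vis, d_model = 8, 16        # hypothetical toy widths
n_patches, n_text = 4, 3      # hypothetical toy sequence lengths

# Stand-ins for the vision encoder output and the text token embeddings.
vision_feats = rng.normal(size=(n_patches, d_vis))
text_embeds = rng.normal(size=(n_text, d_model))

# Randomly initialised adapter weights (assumed two-layer MLP).
w1 = rng.normal(size=(d_vis, d_model)) * 0.1
b1 = np.zeros(d_model)
w2 = rng.normal(size=(d_model, d_model)) * 0.1
b2 = np.zeros(d_model)

vision_tokens = mlp_adapter(vision_feats, w1, b1, w2, b2)

# Interleave: one text token, then the image tokens, then the remaining text.
sequence = np.concatenate([text_embeds[:1], vision_tokens, text_embeds[1:]], axis=0)
print(sequence.shape)  # (7, 16)
```

The language model would consume `sequence` like any other token sequence; only the adapter needs to know the vision encoder's feature width.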

A Coding Implementation on MONAI for End-to-End 3D Spleen Segmentation Using UNet on Medical CT Volumes

In this tutorial, we build an end-to-end 3D medical image segmentation pipeline using MONAI to segment the spleen on the Medical Segmentation Decathlon Task09 dataset. We work with volumetric CT scans, apply medical imaging transformations such as orientation alignment, voxel-spacing normalization, intensity windowing, foreground cropping, and patch-based sampling, and then train a 3D UNet model for binary organ segmentation. We also use mixed precision training, DiceCE loss, sliding-window inference, Dice-based validation, and qualitative visualization to understand how the model learns and how its predictions compare with the ground-truth masks. Overall, we move from raw medical volumes to a complete train–validate–visualize segmentation system.

!pip install -q "monai[nibabel,tqdm,matplotlib]==1.5.2" 2>/dev/null
import os, time, glob, tempfile, warnings
import numpy as np
import torch
import matplotlib.pyplot as plt
from torch...
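The Dice-based validation mentioned in this excerpt compares a predicted binary mask with the ground-truth mask. As a minimal sketch, assuming plain NumPy arrays and a hypothetical helper named `dice_score` (this is not MONAI's own metric class, and the empty-mask convention below is a choice made for this example), the score can be computed as twice the overlap divided by the total number of foreground voxels:

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treated as a perfect match in this sketch.
        return 1.0
    return float(2.0 * intersection / denom)


# Toy 4x4x4 "volume": the ground truth is a 2x2x2 cube, and the
# prediction is the same cube shifted by one voxel along the last axis.
target = np.zeros((4, 4, 4), dtype=bool)
target[1:3, 1:3, 1:3] = True

pred = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 0:2] = True

print(dice_score(pred, target))  # 0.5
```

Each cube has 8 voxels and they share 4, so the score is 2 * 4 / 16 = 0.5. A perfect prediction scores 1.0 and a disjoint one scores 0.0.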

Perplexity Moves Deep Research Into Computer, Routing Research Subtasks Across 20+ Frontier Models For Reports, Decks, And Dashboards

Perplexity has moved Deep Research into Computer, its multi-model orchestration system. The upgrade improves accuracy, depth of analysis, and citation quality. Deep Research now breaks hard questions into subtasks and routes them across 20+ frontier models. It returns work-ready reports, decks, and dashboards, all inside Computer.

Deep Research in Computer

Deep Research is a mode that runs many searches, reads sources, and writes a cited report. The new version lives inside Perplexity Computer, which launched in late February 2026. Computer is a cloud system that coordinates up to 20 AI models in one workflow. It is model-agnostic, with Opus 4.6 as its core reasoning engine. Sub-agents handle specialized work, such as Gemini for deep research tasks. Deep Research in Computer is built on two parts: the Agent Search SDK and Search as Code. With one complex question, it builds a research plan automatically. It then finds primary sources across hundreds of sites and cites ever...
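The general idea of breaking a question into subtasks and routing each one to a different model can be sketched as follows. This is an illustrative toy only: it does not use or reflect the Agent Search SDK, and the `Subtask` type, the `plan` and `route` functions, the routing table, and the model names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Subtask:
    kind: str    # category of work, e.g. "search", "analysis", "writing"
    prompt: str  # instruction handed to the chosen model


# Hypothetical routing table: subtask kind -> model identifier.
ROUTES = {
    "search": "model-a",
    "analysis": "model-b",
    "writing": "model-c",
}
DEFAULT_MODEL = "model-core"  # fallback for kinds with no dedicated route


def plan(question):
    """Split one question into a fixed three-step research plan (toy planner)."""
    return [
        Subtask("search", f"Find primary sources for: {question}"),
        Subtask("analysis", f"Compare the evidence on: {question}"),
        Subtask("writing", f"Draft a cited report on: {question}"),
    ]


def route(subtask):
    """Pick a model for a subtask, falling back to the default model."""
    return ROUTES.get(subtask.kind, DEFAULT_MODEL)


assignments = [(t.kind, route(t)) for t in plan("How do hybrid models affect latency?")]
for kind, model in assignments:
    print(f"{kind} -> {model}")
```

A real orchestrator would presumably build the plan dynamically and run subtasks concurrently; the fixed plan and lookup table here only show the shape of the dispatch step.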