Qwen Team Open-Sources Qwen3.6-35B-A3B: A Sparse MoE Vision-Language Model with 3B Active Parameters and Agentic Coding Capabilities
The open-source AI landscape has a new entry worth paying attention to. The Qwen team at Alibaba has released Qwen3.6-35B-A3B, the first open-weight model of the Qwen3.6 generation, and it makes a compelling case that parameter efficiency matters more than raw model size. With 35 billion total parameters but only 3 billion activated during inference, the model delivers agentic coding performance competitive with dense models roughly ten times its active size.

What Is a Sparse MoE Model, and Why Does It Matter Here?

A Mixture of Experts (MoE) model does not run all of its parameters on every forward pass. Instead, it routes each input token through a small subset of specialized sub-networks called "experts," while the remaining parameters sit idle. This lets a model carry an enormous total parameter count while keeping inference compute, and therefore inference cost and latency, proportional only to the active parameter count. Qwen3.6-35B-A3B is a Causal Language...
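The token routing described above can be sketched as a small top-k gating layer. This is a minimal illustration, not the Qwen team's implementation: the dimensions, the use of plain weight matrices as "experts," and the `moe_forward` function are all hypothetical, chosen only to show how compute scales with the active (top-k) experts rather than the total expert count.

```python
import numpy as np

rng = np.random.default_rng(0)

D_MODEL = 8    # hidden size (illustrative, far smaller than a real model)
N_EXPERTS = 4  # total experts
TOP_K = 2      # experts activated per token

# Each "expert" stands in for a feed-forward block; here just a weight matrix.
experts = [rng.normal(size=(D_MODEL, D_MODEL)) for _ in range(N_EXPERTS)]
# The router is a linear layer producing one score (logit) per expert.
router_w = rng.normal(size=(D_MODEL, N_EXPERTS))

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_forward(token: np.ndarray) -> np.ndarray:
    """Route one token through its top-k experts; the rest stay idle."""
    logits = token @ router_w                  # one score per expert
    top = np.argsort(logits)[-TOP_K:]          # indices of the chosen experts
    weights = softmax(logits[top])             # renormalize over the chosen set
    # Only TOP_K expert matmuls actually run, so inference compute tracks
    # the active parameter count, not the total parameter count.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=D_MODEL)
out = moe_forward(token)
print(out.shape)  # (8,)
```

In a real sparse MoE transformer the same idea applies per layer: the router picks a handful of experts per token, and the 35B total parameters translate into only about 3B worth of compute per forward pass.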

