2026-02-26

Alibaba has launched its Qwen3.5 Medium Model series, offering powerful large language models (LLMs) that run efficiently on local computers. The release includes four models, three of which are open source under the Apache 2.0 license, and one proprietary Flash model accessible via Alibaba Cloud.

These models match or surpass leading U.S. alternatives such as OpenAI’s GPT-5-mini and Anthropic’s Claude Sonnet 4.5 on benchmarks for knowledge and visual reasoning. Notably, the flagship Qwen3.5-35B-A3B can handle context lengths of more than 1 million tokens on consumer-grade GPUs with 32 GB of VRAM, thanks to near-lossless 4-bit quantization and a hybrid Gated Delta + Mixture-of-Experts architecture, News.Az reports, citing foreign media.
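To see why 4-bit quantization matters for a 32 GB card, a rough back-of-the-envelope estimate of weight memory can help. The sketch below assumes that "35B" in the model name means roughly 35 billion total parameters. It counts the weights only and ignores the KV cache, activations, and per-block quantization metadata, so real usage would be higher. The function name is illustrative, not part of any Qwen tooling.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


# Assumption: "35B" denotes about 35 billion total parameters.
PARAMS = 35e9

fp16_gb = weight_memory_gb(PARAMS, 16)  # 16-bit weights
int4_gb = weight_memory_gb(PARAMS, 4)   # 4-bit quantized weights

print(f"16-bit weights: ~{fp16_gb:.1f} GB")
print(f"4-bit weights:  ~{int4_gb:.1f} GB")
```

Under these assumptions, the 16-bit weights alone (about 70 GB) would not fit in 32 GB of VRAM, while the 4-bit weights (about 17.5 GB) leave room for the context cache.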

The series also introduces “Thinking Mode,” which lets models reason internally before generating responses, and supports agentic tool calling for enterprise applications. Pricing for the hosted Qwen3.5-Flash API is highly competitive, starting at $0.10 per million input tokens.
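As a rough illustration of the quoted rate, the sketch below estimates the input-side cost of a workload at $0.10 per million input tokens. The article gives only a "starting at" input price, so output-token pricing and any tiering are left out. The function name and the sample workload are hypothetical.

```python
# Input price quoted in the article; described there as a "starting at" rate.
INPUT_PRICE_PER_MILLION_USD = 0.10


def input_cost_usd(input_tokens: int,
                   price_per_million: float = INPUT_PRICE_PER_MILLION_USD) -> float:
    """Estimate the cost of input tokens only, in U.S. dollars."""
    return input_tokens / 1_000_000 * price_per_million


# Hypothetical workload: 250 million input tokens in a month.
monthly_tokens = 250_000_000
print(f"Estimated input cost: ${input_cost_usd(monthly_tokens):.2f}")
```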

By combining high performance, low compute requirements, and enterprise-ready features, Qwen3.5 opens the door for local AI deployment at companies previously reliant on expensive cloud-based LLMs, while maintaining data privacy and cost efficiency.

News.Az