LongCat API Platform Change Log

Version: 2026-02-05

LongCat-Flash-Lite Release

LongCat-Flash-Lite is officially released. Built on a highly efficient MoE architecture (68.5B total parameters, about 3B activated per inference), the model achieves superior parameter utilization through an N-gram embedding table, specifically optimized for inference efficiency and specialized use cases.
Compared to models of similar scale, the core features are as follows:
- Superior Inference Efficiency: By mitigating I/O bottlenecks within MoE layers via the N-gram embedding table—combined with specialized cache and kernel optimizations—the model significantly reduces latency and boosts overall efficiency.
- Strong Agentic and Coding Performance: Demonstrates exceptional competitiveness in tool-use and coding proficiency, delivering high-tier performance relative to its model scale.
For calling instructions, please refer to: LongCat API Platform Interface Documentation
Open-Source Platforms:
- Hugging Face: https://huggingface.co/meituan-longcat/LongCat-Flash-Lite

Version: 2026-01-14

LongCat-Flash-Thinking-2601 Release

LongCat-Flash-Thinking-2601 is officially released. As an upgraded reasoning model, it is built on a Mixture-of-Experts (MoE) architecture with a total of 560 billion parameters. While maintaining strong competitiveness in traditional reasoning benchmarks, the model systematically enhances AI Agent reasoning capabilities through large-scale multi-environment reinforcement learning.
Compared to the LongCat-Flash-Thinking model, the core features of this upgrade are as follows:
- Extreme Robustness in Noisy Environments: Through systematic curriculum training focused on noise and uncertainty in real-world environments, the model delivers exceptional performance in agent tool calling, agent search, and tool-fusion reasoning, with a significant boost in generalization capabilities.
- Powerful Agent Capabilities: By constructing environment dependency maps with over 60 types of tools, combined with multi-environment expansion and large-scale exploration training, the model significantly improves generalization in complex, out-of-distribution (OOD) real-world scenarios.
- Advanced Deep Thinking Mode: Utilizing parallel reasoning to expand thinking breadth, combined with a summarization mechanism featuring recursive feedback to expand thinking depth, effectively solving high-difficulty problems.
For calling instructions, please refer to: LongCat API Platform Interface Documentation
Open-Source Platforms:
- Hugging Face：https://huggingface.co/meituan-longcat/LongCat-Flash-Thinking-2601
- Github：https://github.com/meituan-longcat/LongCat-Flash-Thinking-2601

Version: 2025-12-22

LongCat-Flash-Chat Update

The LongCat-Flash-Chat model has been upgraded to a new version. This update involves model capability enhancements only. The model name and API invocation method remain unchanged.
Building upon its principles of “extreme efficiency” and “lightning-fast responsiveness,” the new version of LongCat-Flash-Chat further improves its context handling and programming capabilities:
- Significantly enhanced programming performance: Deep optimizations for developer-oriented use cases, with substantial improvements in code generation, debugging, and code explanation tasks. Developers are encouraged to prioritize evaluation and testing.
- Support for a 256K ultra-long context length: The context length has doubled compared to the previous generation (128K), enabling efficient processing of large documents and long-sequence tasks.
- Comprehensively strengthened multilingual capabilities: High-quality support for nine languages, including Spanish, French, Arabic, Portuguese, Russian, and Indonesian.
- More robust Agent capabilities: Improved stability and efficiency in complex tool invocation and multi-step task execution.
For API usage details, please refer to the LongCat API Platform Interface Documentation.

Version: 2025-09-22

LongCat-Flash-Thinking Release

LongCat-Flash-Thinking is officially released and open-sourced simultaneously. It is a deep-thinking model which you can chat with for free on LongCat Chat or call via the API by specifying model=LongCat-Flash-Thinking.
LongCat Chat URL: https://longcat.chat/
For calling instructions, please refer to: LongCat API Platform Interface Documentation
Open-Source Platforms:
- Hugging Face: https://huggingface.co/meituan-longcat/LongCat-Flash-Thinking
- Github: https://github.com/meituan-longcat/LongCat-Flash-Thinking

Version: 2025-09-05

LongCat API Platform Launch

The LongCat API Platform is now live, supporting API calls to the LongCat-Flash-Chat model. You can call it by specifying model=LongCat-Flash-Chat.
For calling instructions, please refer to: LongCat API Platform Interface Documentation

Version: 2025-08-29

LongCat-Flash-Chat Release

LongCat-Flash-Chat is officially released and open-sourced simultaneously. It is a high-performance, general-purpose conversational model that you can chat with for free on LongCat Chat.
LongCat Chat URL: https://longcat.chat/
Open-Source Platforms:
- Hugging Face: https://huggingface.co/meituan-longcat/LongCat-Flash-Chat
- Github: https://github.com/meituan-longcat/LongCat-Flash-Chat