LongCat API Platform Interface Documentation
Overview
The LongCat API Platform provides proxy access exclusively to the LongCat series of models through endpoints compatible with the OpenAI and Anthropic API formats. This documentation follows standard API reference conventions.
Base URLs
Production Endpoint: https://api.longcat.chat
Authentication
All API requests require authentication using an API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
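For example, a minimal authenticated request (a sketch using the Python `requests` library against the Chat Completions endpoint documented below):

```python
import requests

# Every request must carry the API key in the Authorization header.
response = requests.post(
    "https://api.longcat.chat/openai/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "LongCat-Flash-Chat",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(response.json())
```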
Endpoints
Chat Completions
POST /openai/v1/chat/completions
Create a chat completion using OpenAI-compatible format.
Headers
- `Authorization: Bearer YOUR_API_KEY` (required)
- `Content-Type: application/json`
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model identifier: `LongCat-Flash-Chat` or `LongCat-Flash-Thinking` |
| messages | array | Yes | Array of message objects; only text input is supported |
| stream | boolean | No | Whether to stream the response (default: false) |
| max_tokens | integer | No | Maximum number of tokens to generate (default: 1024) |
| temperature | number | No | Sampling temperature between 0 and 1 |
| top_p | number | No | Nucleus sampling parameter |
| enable_thinking | boolean | No | Enables thinking mode (default: false); only effective for the LongCat-Flash-Thinking model |
| thinking_budget | integer | No | Maximum length of thinking content; only effective for the LongCat-Flash-Thinking model. The minimum and default values are both 1024. When used together with max_tokens, ensure that max_tokens is greater than thinking_budget |
Message Object
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | The role of the message author. Must be one of: `system` (sets the behavior and context for the assistant), `user` (messages from the human user), `assistant` (messages from the AI assistant, for conversation history) |
| content | string | Yes | The message content. A string for simple text messages. |
Example Request
```json
{
  "model": "LongCat-Flash-Chat",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ],
  "stream": false,
  "max_tokens": 150,
  "temperature": 0.7
}
```
Example Request (Thinking)
```json
{
  "model": "LongCat-Flash-Thinking",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ],
  "stream": false,
  "max_tokens": 1500,
  "temperature": 0.7,
  "enable_thinking": true,
  "thinking_budget": 1024
}
```
Response (Non-streaming)
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "LongCat-Flash-Chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking. How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 15,
    "total_tokens": 35
  }
}
```
Response (Streaming)
When `stream` is `true`, the response is returned as Server-Sent Events (SSE) with `Content-Type: text/event-stream`:

```
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"LongCat-Flash-Chat","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"LongCat-Flash-Chat","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"LongCat-Flash-Chat","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: [DONE]
```
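In practice, clients usually consume these events through an SDK rather than parsing SSE by hand. A minimal sketch with the OpenAI Python SDK (the base URL mirrors the documented `/openai/v1` path; see the Examples section below):

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.longcat.chat/openai/v1",
    api_key="your-api-key",
)

# With stream=True the SDK yields chat.completion.chunk objects;
# each chunk's delta carries the next piece of the assistant's reply.
stream = client.chat.completions.create(
    model="LongCat-Flash-Chat",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
```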
Anthropic Messages
POST /anthropic/v1/messages
Create a message using Anthropic's Claude API format.
Headers
- `Authorization: Bearer YOUR_API_KEY` (required)
- `Content-Type: application/json`
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model identifier: `LongCat-Flash-Chat` or `LongCat-Flash-Thinking` |
| messages | array | Yes | Array of message objects |
| max_tokens | integer | No | Maximum number of tokens to generate |
| stream | boolean | No | Whether to stream the response (default: false) |
| temperature | number | No | Sampling temperature between 0 and 1 |
| top_p | number | No | Nucleus sampling parameter |
| system | string | No | System message to set context |
| enable_thinking | boolean | No | Enables thinking mode (default: false); only effective for the LongCat-Flash-Thinking model |
| thinking_budget | integer | No | Maximum length of thinking content; only effective for the LongCat-Flash-Thinking model. The minimum and default values are both 1024. When used together with max_tokens, ensure that max_tokens is greater than thinking_budget |
Message Object
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | The role of the message author. Must be one of: `user` (messages from the human user), `assistant` (messages from the model, for conversation history). Note: system messages are passed separately via the `system` parameter |
| content | string | Yes | The message content. A string for text-only messages. |
Example Request
```json
{
  "model": "LongCat-Flash-Chat",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "user",
      "content": "Hello, LongCat"
    }
  ],
  "stream": false,
  "temperature": 0.7
}
```
Example Request (Thinking)
```json
{
  "model": "LongCat-Flash-Thinking",
  "messages": [
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ],
  "system": "You are a helpful assistant.",
  "stream": false,
  "max_tokens": 1500,
  "temperature": 0.7,
  "enable_thinking": true,
  "thinking_budget": 1024
}
```
Response (Non-streaming)
```json
{
  "id": "msg_123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello! How can I help you today?"
    }
  ],
  "model": "LongCat-Flash-Chat",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 10,
    "output_tokens": 8
  }
}
```
Response (Streaming)
When `stream` is `true`, the response follows Anthropic's SSE format with `Content-Type: text/event-stream`:

```
event: message_start
data: {"type": "message_start", "message": {"id": "msg_123", "type": "message", "role": "assistant", "content": [], "model": "LongCat-Flash-Chat", "stop_reason": null, "stop_sequence": null, "usage": {"input_tokens": 10, "output_tokens": 0}}}

event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "!"}}

event: content_block_stop
data: {"type": "content_block_stop", "index": 0}

event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "end_turn", "stop_sequence": null}, "usage": {"output_tokens": 8}}

event: message_stop
data: {"type": "message_stop"}
```
Error Responses
The API uses conventional HTTP response codes to indicate success or failure:
HTTP Status Codes
| Status Code | Status Name | Description |
|---|---|---|
| 200 | OK | Request successful |
| 400 | Bad Request | Invalid request parameters or malformed JSON |
| 401 | Unauthorized | Invalid or missing API key |
| 403 | Forbidden | API key doesn't have permission for the requested resource |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server encountered an unexpected condition |
| 502 | Bad Gateway | Invalid response from upstream server |
| 503 | Service Unavailable | Server temporarily unavailable |
Error Response Format
All errors return a JSON object with the following structure:
```json
{
  "error": {
    "message": "Human-readable error description",
    "type": "error_type_identifier",
    "code": "specific_error_code"
  }
}
```
Error Types and Codes
| Error Type | Error Code | HTTP Status | Description |
|---|---|---|---|
| authentication_error | invalid_api_key | 401 | Invalid API key provided |
| permission_error | insufficient_quota | 403 | API key has insufficient quota |
| invalid_request_error | invalid_parameter | 400 | Invalid parameter value |
| invalid_request_error | invalid_json | 400 | Invalid JSON format |
| rate_limit_error | rate_limit_exceeded | 429 | Too many requests in a short period |
| server_error | internal_error | 500 | Internal server error |
Example Error Responses
Invalid API Key
```json
{
  "error": {
    "message": "Invalid API key provided",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}
```
Rate Limit Exceeded
```json
{
  "error": {
    "message": "Rate limit exceeded. Please try again in 60 seconds",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
```
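A sketch of handling this error shape in Python (using the `requests` library; the field names follow the structure documented above):

```python
import requests

response = requests.post(
    "https://api.longcat.chat/openai/v1/chat/completions",
    headers={"Authorization": "Bearer your-api-key"},
    json={
        "model": "LongCat-Flash-Chat",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)

if response.status_code != 200:
    # All errors share the {"error": {message, type, code}} structure.
    err = response.json()["error"]
    print(f"{response.status_code} {err['type']} ({err['code']}): {err['message']}")
```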
Rate Limiting
Rate limits are enforced per API key. When exceeded, you'll receive a 429 status code.
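A simple client-side backoff sketch for 429 responses (the retry count and delays are illustrative, not platform recommendations):

```python
import time
import requests

def post_with_backoff(url, headers, payload, max_retries=3):
    """POST a request, retrying with exponential backoff on HTTP 429."""
    for attempt in range(max_retries + 1):
        response = requests.post(url, headers=headers, json=payload)
        if response.status_code != 429 or attempt == max_retries:
            return response
        time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, ... before retrying
```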
SDK Compatibility
This API is designed to be compatible with:
- OpenAI Python SDK (for `/openai/` endpoints)
- Anthropic Python SDK (for `/anthropic/` endpoints)
- Any HTTP client that supports the respective API formats
Examples
Using with OpenAI Python SDK
```python
from openai import OpenAI

# Point the client at the OpenAI-compatible LongCat endpoint;
# the SDK appends /chat/completions to this base URL, matching
# the documented /openai/v1/chat/completions path.
client = OpenAI(
    base_url="https://api.longcat.chat/openai/v1",
    api_key="your-api-key",
)

response = client.chat.completions.create(
    model="LongCat-Flash-Chat",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)
print(response.choices[0].message.content)
```
Using with Anthropic Python SDK
```python
import anthropic

# Point the client at the Anthropic-compatible LongCat endpoint;
# the SDK appends /v1/messages to this base URL, matching the
# documented /anthropic/v1/messages path. LongCat expects the key
# in the Authorization header, so it is set explicitly here in
# addition to the SDK's default x-api-key header.
client = anthropic.Anthropic(
    api_key="your-api-key",
    base_url="https://api.longcat.chat/anthropic",
    default_headers={
        "Authorization": "Bearer your-api-key",
    },
)

message = client.messages.create(
    model="LongCat-Flash-Chat",
    max_tokens=150,
    messages=[
        {"role": "user", "content": "Hello, LongCat!"}
    ],
)
print(message.content[0].text)
```
Using with cURL
```bash
# OpenAI-style request
curl -X POST https://api.longcat.chat/openai/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "LongCat-Flash-Chat",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'

# Anthropic-style request
curl -X POST https://api.longcat.chat/anthropic/v1/messages \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "LongCat-Flash-Chat",
    "max_tokens": 1000,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
📋 Need help? Check out our comprehensive FAQ for common questions and troubleshooting tips.