LongCat API Platform Interface Documentation
Overview
The LongCat API Platform provides proxy access exclusively to the LongCat series of models through endpoints compatible with the OpenAI and Anthropic API formats. This documentation follows standard API reference conventions.
Base URLs
Production Endpoint: https://api.longcat.chat
Authentication
All API requests require authentication using an API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
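For example, a minimal authenticated request (a sketch using the Python `requests` library against the Chat Completions endpoint documented below):

```python
import requests

# Every request must carry the API key in the Authorization header.
response = requests.post(
    "https://api.longcat.chat/openai/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "LongCat-Flash-Chat",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(response.json())
```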
Endpoints
Chat Completions
POST /openai/v1/chat/completions
Create a chat completion using OpenAI-compatible format.
Headers
- `Authorization: Bearer YOUR_API_KEY` (required)
- `Content-Type: application/json`
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model identifier: `LongCat-Flash-Chat` or `LongCat-Flash-Thinking` |
| messages | array | Yes | Array of message objects; only text input is supported |
| stream | boolean | No | Whether to stream the response (default: false) |
| max_tokens | integer | No | Maximum number of tokens to generate (default: 1024) |
| temperature | number | No | Sampling temperature between 0 and 1 |
| top_p | number | No | Nucleus sampling parameter |
| enable_thinking | boolean | No | Enables thinking mode (default: false); only effective for the LongCat-Flash-Thinking model |
| thinking_budget | integer | No | Maximum length of thinking content; only effective for the LongCat-Flash-Thinking model. The minimum and default values are both 1024. When used together with max_tokens, ensure that max_tokens is greater than thinking_budget |
Message Object
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | The role of the message author. Must be one of: `system` (sets the behavior and context for the assistant), `user` (messages from the human user), `assistant` (messages from the AI assistant, for conversation history) |
| content | string | Yes | The message content. A string for simple text messages. |
Example Request
```json
{
  "model": "LongCat-Flash-Chat",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ],
  "stream": false,
  "max_tokens": 150,
  "temperature": 0.7
}
```
Example Request (Thinking)
```json
{
  "model": "LongCat-Flash-Thinking",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ],
  "stream": false,
  "max_tokens": 1500,
  "temperature": 0.7,
  "enable_thinking": true,
  "thinking_budget": 1024
}
```
Response (Non-streaming)
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "LongCat-Flash-Chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking. How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 15,
    "total_tokens": 35
  }
}
```
Response (Streaming)
When `stream` is `true`, the response is returned as Server-Sent Events (SSE) with `Content-Type: text/event-stream`:

```
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"LongCat-Flash-Chat","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"LongCat-Flash-Chat","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"LongCat-Flash-Chat","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: [DONE]
```
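In practice, clients usually consume these events through an SDK rather than parsing SSE by hand. A minimal sketch with the OpenAI Python SDK (the base URL mirrors the documented `/openai/v1` path; see the Examples section below):

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.longcat.chat/openai/v1",
    api_key="your-api-key",
)

# With stream=True the SDK yields chat.completion.chunk objects;
# each chunk's delta carries the next piece of the assistant's reply.
stream = client.chat.completions.create(
    model="LongCat-Flash-Chat",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
```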
Anthropic Messages
POST /anthropic/v1/messages
Create a message using Anthropic's Claude API format.
Headers
- `Authorization: Bearer YOUR_API_KEY` (required)
- `Content-Type: application/json`
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model identifier: `LongCat-Flash-Chat` or `LongCat-Flash-Thinking` |
| messages | array | Yes | Array of message objects |
| max_tokens | integer | No | Maximum number of tokens to generate |
| stream | boolean | No | Whether to stream the response (default: false) |
| temperature | number | No | Sampling temperature between 0 and 1 |
| top_p | number | No | Nucleus sampling parameter |
| system | string | No | System message to set context |
| enable_thinking | boolean | No | Enables thinking mode (default: false); only effective for the LongCat-Flash-Thinking model |
| thinking_budget | integer | No | Maximum length of thinking content; only effective for the LongCat-Flash-Thinking model. The minimum and default values are both 1024. When used together with max_tokens, ensure that max_tokens is greater than thinking_budget |
Message Object
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Yes | The role of the message author. Must be one of: `user` (messages from the human user), `assistant` (messages from the model, for conversation history). Note: system messages are passed separately via the `system` parameter |
| content | string | Yes | The message content. A string for text-only messages. |
Example Request
```json
{
  "model": "LongCat-Flash-Chat",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "user",
      "content": "Hello, LongCat"
    }
  ],
  "stream": false,
  "temperature": 0.7
}
```
Example Request (Thinking)
```json
{
  "model": "LongCat-Flash-Thinking",
  "messages": [
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ],
  "system": "You are a helpful assistant.",
  "stream": false,
  "max_tokens": 1500,
  "temperature": 0.7,
  "enable_thinking": true,
  "thinking_budget": 1024
}
```
Response (Non-streaming)
```json
{
  "id": "msg_123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello! How can I help you today?"
    }
  ],
  "model": "LongCat-Flash-Chat",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 10,
    "output_tokens": 8
  }
}
```
Response (Streaming)
When `stream` is `true`, the response follows Anthropic's SSE format with `Content-Type: text/event-stream`:

```
event: message_start
data: {"type": "message_start", "message": {"id": "msg_123", "type": "message", "role": "assistant", "content": [], "model": "LongCat-Flash-Chat", "stop_reason": null, "stop_sequence": null, "usage": {"input_tokens": 10, "output_tokens": 0}}}

event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "!"}}

event: content_block_stop
data: {"type": "content_block_stop", "index": 0}

event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "end_turn", "stop_sequence": null}, "usage": {"output_tokens": 8}}

event: message_stop
data: {"type": "message_stop"}
```
Error Responses
The API uses conventional HTTP response codes to indicate success or failure:
HTTP Status Codes
| Status Code | Status Name | Description |
|---|---|---|
| 200 | OK | Request successful |
| 400 | Bad Request | Invalid request parameters or malformed JSON |
| 401 | Unauthorized | Invalid or missing API key |
| 403 | Forbidden | API key doesn't have permission for the requested resource |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server encountered an unexpected condition |
| 502 | Bad Gateway | Invalid response from upstream server |
| 503 | Service Unavailable | Server temporarily unavailable |
Error Response Format
All errors return a JSON object with the following structure:
```json
{
  "error": {
    "message": "Human-readable error description",
    "type": "error_type_identifier",
    "code": "specific_error_code"
  }
}
```
Error Types and Codes
| Error Type | Error Code | HTTP Status | Description |
|---|---|---|---|
| authentication_error | invalid_api_key | 401 | Invalid API key provided |
| permission_error | insufficient_quota | 403 | API key has insufficient quota |
| invalid_request_error | invalid_parameter | 400 | Invalid parameter value |
| invalid_request_error | invalid_json | 400 | Invalid JSON format |
| rate_limit_error | rate_limit_exceeded | 429 | Too many requests in a short period |
| server_error | internal_error | 500 | Internal server error |
Example Error Responses
Invalid API Key
```json
{
  "error": {
    "message": "Invalid API key provided",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}
```
Rate Limit Exceeded
```json
{
  "error": {
    "message": "Rate limit exceeded. Please try again in 60 seconds",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
```
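A sketch of handling this error shape in Python (using the `requests` library; the field names follow the structure documented above):

```python
import requests

response = requests.post(
    "https://api.longcat.chat/openai/v1/chat/completions",
    headers={"Authorization": "Bearer your-api-key"},
    json={
        "model": "LongCat-Flash-Chat",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)

if response.status_code != 200:
    # All errors share the {"error": {message, type, code}} structure.
    err = response.json()["error"]
    print(f"{response.status_code} {err['type']} ({err['code']}): {err['message']}")
```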
Rate Limiting
Rate limits are enforced per API key. When exceeded, you'll receive a 429 status code.
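A simple client-side backoff sketch for 429 responses (the retry count and delays are illustrative, not platform recommendations):

```python
import time
import requests

def post_with_backoff(url, headers, payload, max_retries=3):
    """POST a request, retrying with exponential backoff on HTTP 429."""
    for attempt in range(max_retries + 1):
        response = requests.post(url, headers=headers, json=payload)
        if response.status_code != 429 or attempt == max_retries:
            return response
        time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, ... before retrying
```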
SDK Compatibility
This API is designed to be compatible with:
- OpenAI Python SDK (for `/openai/` endpoints)
- Anthropic Python SDK (for `/anthropic/` endpoints)
- Any HTTP client that supports the respective API formats
Examples
Using with OpenAI Python SDK
```python
from openai import OpenAI

# Point the client at the OpenAI-compatible LongCat endpoint;
# the SDK appends /chat/completions to this base URL, matching
# the documented /openai/v1/chat/completions path.
client = OpenAI(
    base_url="https://api.longcat.chat/openai/v1",
    api_key="your-api-key",
)

response = client.chat.completions.create(
    model="LongCat-Flash-Chat",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)
print(response.choices[0].message.content)
```
Using with Anthropic Python SDK
```python
import anthropic

# Point the client at the Anthropic-compatible LongCat endpoint;
# the SDK appends /v1/messages to this base URL, matching the
# documented /anthropic/v1/messages path. LongCat expects the key
# in the Authorization header, so it is set explicitly here in
# addition to the SDK's default x-api-key header.
client = anthropic.Anthropic(
    api_key="your-api-key",
    base_url="https://api.longcat.chat/anthropic",
    default_headers={
        "Authorization": "Bearer your-api-key",
    },
)

message = client.messages.create(
    model="LongCat-Flash-Chat",
    max_tokens=150,
    messages=[
        {"role": "user", "content": "Hello, LongCat!"}
    ],
)
print(message.content[0].text)
```
Using with cURL
```bash
# OpenAI-style request
curl -X POST https://api.longcat.chat/openai/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "LongCat-Flash-Chat",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'

# Anthropic-style request
curl -X POST https://api.longcat.chat/anthropic/v1/messages \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "LongCat-Flash-Chat",
    "max_tokens": 1000,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
📋 Need help? Check out our comprehensive FAQ for common questions and troubleshooting tips.