The Chat Service provides comprehensive endpoints for creating and managing conversational AI interactions. It supports multi-turn dialogues with LLM models and AI agents, file uploads, knowledge base integration, real-time streaming, and audio transcription.

[Figure: Chat System Overview]

Authentication

All endpoints require a valid Bearer token in the Authorization header and appropriate RBAC permissions.
Required Permissions:
  • chats:read - Read chat sessions and messages
  • chats:write - Create chats and send messages
  • chats:delete - Delete chat sessions

Base URL

/api/chat

Chat Session Management

Create Chat Session

Create a new chat session for organizing conversations.
curl -X POST "{{baseUrl}}/api/chat?org_id=your-org-id" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Product Design Discussion",
    "status": "ACTIVE",
    "settings": {
      "temperature": 0.7,
      "max_tokens": 2000,
      "top_p": 0.9
    }
  }'
Endpoint: POST /api/chat
Query Parameters:
| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| org_id | Yes | UUID | Organization ID |
Request Body:
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| title | string | No | Chat session title (default: “New Chat”) |
| status | string | No | Session status: ACTIVE, ARCHIVED, DELETED |
| settings | object | No | LLM generation settings |
| settings.temperature | number | No | Sampling temperature (0.0-1.0) |
| settings.max_tokens | number | No | Maximum response tokens |
| settings.top_p | number | No | Nucleus sampling threshold |
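
The settings ranges above can be checked on the client before a request is sent. The following is a minimal sketch, not the service's own validation: validateSettings is a hypothetical helper, the positive-integer rule for max_tokens is an assumption, and the 0.0-1.0 range for top_p is borrowed from the send_message override parameters.

```javascript
// Hypothetical client-side check; the service's own validation rules may differ.
function validateSettings(settings = {}) {
  const errors = [];
  const { temperature, max_tokens: maxTokens, top_p: topP } = settings;

  // Documented range for settings.temperature is 0.0-1.0.
  if (temperature !== undefined && !(temperature >= 0 && temperature <= 1)) {
    errors.push('temperature must be between 0.0 and 1.0');
  }
  // Assumption: max_tokens is a positive integer.
  if (maxTokens !== undefined && !(Number.isInteger(maxTokens) && maxTokens > 0)) {
    errors.push('max_tokens must be a positive integer');
  }
  // Assumption: top_p uses the same 0.0-1.0 range as the send_message override.
  if (topP !== undefined && !(topP >= 0 && topP <= 1)) {
    errors.push('top_p must be between 0.0 and 1.0');
  }
  return errors;
}
```

An empty array means the settings object passed every check; omitted fields are skipped because all three are optional.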

Get Chat Session

Retrieve a single chat session with all its messages and file uploads.
curl -X GET "{{baseUrl}}/api/chat?chat_id=a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d&org_id=your-org-id" \
  -H "Authorization: Bearer YOUR_TOKEN"
Endpoint: GET /api/chat
Query Parameters:
| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| chat_id | Yes | UUID | Chat session ID |
| org_id | Yes | UUID | Organization ID |

List Chat Sessions

Retrieve all chat sessions for the authenticated user.
curl -X GET "{{baseUrl}}/api/chat/list?org_id=your-org-id&status=ACTIVE" \
  -H "Authorization: Bearer YOUR_TOKEN"
Endpoint: GET /api/chat/list
Query Parameters:
| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| org_id | Yes | UUID | Organization ID |
| status | No | string | Filter by status: ACTIVE, ARCHIVED, DELETED |

Update Chat Session

Update an existing chat session’s title, status, or settings.
curl -X PUT "{{baseUrl}}/api/chat?chat_id=a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d&org_id=your-org-id" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Updated Product Design Discussion",
    "status": "ACTIVE",
    "settings": {
      "temperature": 0.8,
      "max_tokens": 3000
    }
  }'
Endpoint: PUT /api/chat
Query Parameters:
| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| chat_id | Yes | UUID | Chat session ID |
| org_id | Yes | UUID | Organization ID |
Request Body:
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| title | string | No | New chat title |
| status | string | No | New status: ACTIVE, ARCHIVED, DELETED |
| settings | object | No | Updated LLM settings |

Delete Chat Session

Delete a single chat session and all its messages.
curl -X DELETE "{{baseUrl}}/api/chat/delete_session?session_id=a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d" \
  -H "Authorization: Bearer YOUR_TOKEN"
Endpoint: DELETE /api/chat/delete_session
Query Parameters:
| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| session_id | Yes | UUID | Chat session ID to delete |

Bulk Delete Chat Sessions

Delete multiple chat sessions in a single request.
curl -X POST "{{baseUrl}}/api/chat/bulk_delete_sessions" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "chat_ids": [
      "a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
      "b2c3d4e5-f6a7-4b8c-9d0e-1f2a3b4c5d6e",
      "c3d4e5f6-a7b8-4c9d-8e0f-2a3b4c5d6e7f"
    ]
  }'
Endpoint: POST /api/chat/bulk_delete_sessions
Request Body:
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| chat_ids | array | Yes | Array of chat session UUIDs to delete |

Message Operations

Send Message

Send a message in a chat session and receive a streaming AI response. This is the primary endpoint for conversational interactions.
curl -N -X POST "{{baseUrl}}/api/chat/send_message?org_id=your-org-id&model_id=model-uuid&chat_id=chat-uuid" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "What are the best practices for API design?",
    "thinking": false,
    "file_uploads": ["upload-uuid-1", "upload-uuid-2"],
    "knowledge_base_ids": ["kb-uuid-1", "kb-uuid-2"]
  }'
Endpoint: POST /api/chat/send_message
Query Parameters:
| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| org_id | Yes | UUID | Organization ID |
| model_id | Conditional | UUID | LLM model ID (required if agent_id not provided) |
| agent_id | Conditional | UUID | Agent ID (required if model_id not provided) |
| chat_id | No | UUID | Chat session ID (creates new if not provided) |
| instruction_id | No | UUID | Prompt/instruction ID to guide response |
| temperature | No | float | Override temperature (0.0-1.0) |
| max_tokens | No | int | Override max response tokens |
| top_p | No | float | Override nucleus sampling (0.0-1.0) |
Request Body:
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| content | string | Yes | Message content/question |
| thinking | boolean | No | Enable reasoning mode (for compatible models) |
| file_uploads | array | No | Array of upload UUIDs to attach |
| knowledge_base_ids | array | No | Array of KB UUIDs to search |
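
Because model_id and agent_id are conditionally required, it can help to assemble the query string in one place and fail early when both are missing. This is a sketch under assumptions: buildSendMessageUrl is a hypothetical helper, and it only mirrors the rules listed in the table above.

```javascript
// Hypothetical helper that builds the send_message URL from the documented query parameters.
function buildSendMessageUrl(baseUrl, params) {
  const { org_id: orgId, model_id: modelId, agent_id: agentId } = params;

  if (!orgId) throw new Error('org_id is required');
  // Mirrors the documented rule: one of model_id or agent_id must be present.
  if (!modelId && !agentId) {
    throw new Error('Either model_id or agent_id must be provided');
  }

  const query = new URLSearchParams();
  for (const [key, value] of Object.entries(params)) {
    // Skip optional parameters that were not supplied.
    if (value !== undefined && value !== null) query.set(key, String(value));
  }
  return `${baseUrl}/api/chat/send_message?${query.toString()}`;
}
```

For example, buildSendMessageUrl('https://example.test', { org_id: 'o1', model_id: 'm1' }) yields a URL whose query string is org_id=o1&model_id=m1; omitting chat_id leaves chat creation to the service.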
Response Format: Server-Sent Events (SSE) stream with the following data formats:
| Type | Format | Description |
| --- | --- | --- |
| Token | {"message": "text"} | Regular text token |
| Reasoning | {"type": "reasoning", "message": "..."} | Reasoning step |
| Image | {"media": {...}} | Image generation progress |
| Video | {"media": {...}} | Video generation progress |
| Complete | {"message": "DONE"} | Stream finished |
| Error | {"error": "message"} | Error occurred |
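
A client usually has to branch on these payload shapes. The sketch below shows one possible dispatch order; classifyStreamEvent and its kind labels are hypothetical names. Image and Video payloads share the same documented shape, so they are reported as a single media kind here.

```javascript
// Hypothetical classifier for one parsed SSE payload from send_message.
function classifyStreamEvent(data) {
  if (typeof data.error === 'string') return { kind: 'error', value: data.error };
  // Image and video progress both arrive as {"media": {...}}.
  if (data.media !== undefined) return { kind: 'media', value: data.media };
  if (data.type === 'reasoning') return { kind: 'reasoning', value: data.message };
  // Checked after reasoning so only a plain message of "DONE" ends the stream.
  if (data.message === 'DONE') return { kind: 'done', value: null };
  return { kind: 'token', value: data.message };
}
```

Note that, going only by the formats listed, a regular token whose text is exactly "DONE" cannot be told apart from the completion marker.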
Features:
  • Auto-Chat Creation: Creates new chat if chat_id not provided
  • Auto-Title Generation: Generates meaningful title for new chats
  • Knowledge Base Search: Searches specified KBs and enhances prompt
  • File Content Extraction: Extracts text from uploaded files
  • Billing Integration: Tracks token usage and credits
  • WebSocket Broadcasting: Broadcasts title updates
  • Firebase Sync: Syncs chat updates to Firebase
  • Model Feature Detection: Automatically handles text/image/video generation
  • Error Recovery: Gracefully handles LLM provider errors

File Upload Operations

Upload File

Upload a file to be attached to chat messages.
curl -X POST "{{baseUrl}}/api/chat/upload_file?org_id=your-org-id&chat_id=chat-uuid" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file=@/path/to/document.pdf"
Endpoint: POST /api/chat/upload_file
Query Parameters:
| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| org_id | Yes | UUID | Organization ID |
| chat_id | No | UUID | Chat session ID (for organized storage) |
Request Body: Multipart form data with file field.
Response:
| Field | Type | Description |
| --- | --- | --- |
| id | UUID | Upload record ID (use in send_message) |
| url | string | Presigned URL for file access (7 days) |
Supported File Types:
  • Documents: PDF, TXT, DOCX, XLSX, CSV, MD
  • Images: JPG, PNG, GIF, WebP, SVG
  • Audio: MP3, WAV, M4A, OGG, FLAC
  • Video: MP4, WebM, MOV, AVI
Storage:
  • Files stored in GCP Cloud Storage
  • Organized by: {org_id}/{chat_id}/{filename}
  • Presigned URLs expire in 7 days
  • Content extraction for compatible formats
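
The supported file types listed above can be checked on the client before an upload is attempted. This is a rough sketch that assumes the file extension is a good enough signal; categorizeUpload and SUPPORTED_TYPES are hypothetical names, and the service may well inspect MIME types or accept variants (such as .jpeg) that are not listed here.

```javascript
// Extensions taken from the supported file types listed above.
const SUPPORTED_TYPES = {
  document: ['pdf', 'txt', 'docx', 'xlsx', 'csv', 'md'],
  image: ['jpg', 'png', 'gif', 'webp', 'svg'],
  audio: ['mp3', 'wav', 'm4a', 'ogg', 'flac'],
  video: ['mp4', 'webm', 'mov', 'avi'],
};

// Returns the category for a filename, or null if the extension is not listed.
function categorizeUpload(filename) {
  const dot = filename.lastIndexOf('.');
  if (dot < 0) return null;
  const ext = filename.slice(dot + 1).toLowerCase();
  for (const [category, extensions] of Object.entries(SUPPORTED_TYPES)) {
    if (extensions.includes(ext)) return category;
  }
  return null;
}
```

A null result is a reasonable cue to warn the user before calling upload_file rather than a guarantee that the service would reject the file.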

Audio Operations

Transcribe Audio

Convert audio files or raw audio data to text.
curl -X POST "{{baseUrl}}/api/chat/transcribe?language=en-US" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file=@/path/to/audio.mp3"
Endpoint: POST /api/chat/transcribe
Query Parameters:
| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| language | No | string | Language code (default: “en-US”) |
Request Body: Multipart form data with file field, OR raw bytes with content_type parameter.
Supported Languages:
  • en-US - English (US)
  • en-GB - English (UK)
  • es-ES - Spanish
  • fr-FR - French
  • de-DE - German
  • it-IT - Italian
  • pt-BR - Portuguese (Brazil)
  • ja-JP - Japanese
  • ko-KR - Korean
  • zh-CN - Chinese (Simplified)
Supported Audio Formats:
  • MP3
  • WAV
  • M4A
  • OGG
  • FLAC
  • WebM

Prompt Operations

Generate Prompts

Generate AI-powered prompts based on input text with streaming response.
curl -N -X POST "{{baseUrl}}/api/chat/prompt" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "artificial intelligence in healthcare",
    "num_prompts": 3,
    "prompt_type": "creative",
    "model": "chat"
  }'
Endpoint: POST /api/chat/prompt
Request Body:
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | Input text to generate prompts from |
| num_prompts | number | No | Number of prompts to generate (default: 1) |
| prompt_type | string | No | Type: creative, task, question, continuation (default: task) |
| model | string | No | Model: chat, reason (default: chat) |
Response: Server-Sent Events stream with format:
data: {"content": "token"}
data: {"content": "DONE"}
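
Network chunks do not necessarily line up with event boundaries, so a client has to buffer partial events before parsing them. The sketch below shows one way to reassemble the prompt stream's content tokens. It assumes events are separated by a blank line using LF newlines (the SSE format also permits CRLF), and createSseParser and collectPromptText are hypothetical names.

```javascript
// Incremental SSE parser: feed decoded text chunks, get back completed JSON payloads.
function createSseParser() {
  let buffer = '';
  return {
    push(chunk) {
      buffer += chunk;
      const events = buffer.split('\n\n');
      buffer = events.pop(); // keep the trailing partial event for the next chunk
      const payloads = [];
      for (const event of events) {
        for (const line of event.split('\n')) {
          if (line.startsWith('data:')) payloads.push(JSON.parse(line.slice(5)));
        }
      }
      return payloads;
    },
  };
}

// Concatenates "content" tokens until the {"content": "DONE"} marker arrives.
function collectPromptText(chunks) {
  const parser = createSseParser();
  let text = '';
  for (const chunk of chunks) {
    for (const { content } of parser.push(chunk)) {
      if (content === 'DONE') return text;
      if (typeof content === 'string') text += content;
    }
  }
  return text;
}
```

In a browser, the chunks would come from decoding response.body with a TextDecoder; here they are passed in as an array so the parsing logic stays independent of the transport.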

Knowledge Base Operations

Get Available Knowledge Bases

Retrieve knowledge bases available for chat integration.
curl -X GET "{{baseUrl}}/api/chat/available_knowledge_bases?org_id=your-org-id" \
  -H "Authorization: Bearer YOUR_TOKEN"
Endpoint: GET /api/chat/available_knowledge_bases
Query Parameters:
| Parameter | Required | Type | Description |
| --- | --- | --- | --- |
| org_id | Yes | UUID | Organization ID |

Error Responses

| Status Code | Description | Example |
| --- | --- | --- |
| 400 | Bad Request | Missing required parameter |
| 401 | Unauthorized | Invalid or missing token |
| 402 | Payment Required | Insufficient credits |
| 403 | Forbidden | Insufficient permissions |
| 404 | Not Found | Chat session or resource not found |
| 500 | Internal Server Error | Server-side error |
Error Response Format:
{
  "detail": "Error message describing what went wrong"
}
Common Error Messages:
// Insufficient credits
{
  "detail": "Insufficient credits to use this model."
}

// Model/Agent required
{
  "detail": "Either model_id or agent_id must be provided"
}

// Chat not found
{
  "detail": "Chat session not found"
}

// LLM provider error
{
  "detail": "LLM provider openai is not configured correctly. Please contact support."
}
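
On the client, the status code and the detail field can be folded into one structure that drives what the UI does next. The following is a sketch only: describeChatError and the action labels are hypothetical, and treating 5xx responses as retryable is an assumption rather than documented behavior.

```javascript
// Hypothetical mapping from documented status codes to a suggested client action.
const STATUS_ACTIONS = {
  400: 'fix-request',
  401: 'reauthenticate',
  402: 'add-credits',
  403: 'request-permission',
  404: 'check-ids',
  500: 'retry-later',
};

// Normalizes an error response ({"detail": "..."}) into a single object.
function describeChatError(status, body) {
  const detail = body && typeof body.detail === 'string' ? body.detail : 'Unknown error';
  const action = STATUS_ACTIONS[status] ?? 'unknown';
  // Assumption: only server-side (5xx) failures are worth retrying unchanged.
  return { status, detail, action, retryable: status >= 500 };
}
```

For a 402 response carrying the insufficient-credits body shown above, this returns the original detail text with the add-credits action and retryable set to false.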

Implementation Notes

Streaming Response Handling

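All streaming endpoints use Server-Sent Events (SSE) format. The browser EventSource API only issues GET requests and cannot set an Authorization header or request body, so it cannot call the POST streaming endpoints; read the stream with fetch instead:
// Browser example (run inside an ES module or async function).
// baseUrl, query, and token are supplied by the caller.
const response = await fetch(`${baseUrl}/api/chat/send_message?${query}`, {
  method: 'POST',
  headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({ content: 'What are the best practices for API design?' }),
});
if (!response.ok) throw new Error(`Request failed: ${response.status}`);

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

stream: while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  // SSE events are separated by a blank line; keep any partial event in the buffer
  const events = buffer.split('\n\n');
  buffer = events.pop();

  for (const event of events) {
    const line = event.split('\n').find((l) => l.startsWith('data:'));
    if (!line) continue;
    const data = JSON.parse(line.slice(5));

    if (data.error) throw new Error(data.error);
    if (data.message === 'DONE') break stream;
    if (data.message !== undefined) console.log(data.message);
  }
}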

Billing Integration

Every chat interaction is billed:
  1. HOLD created at message start (qty=1)
  2. Tokens counted during generation
  3. DEBIT finalized with actual token usage
  4. Usage metadata updated in chat session
Token Calculation:
weighted_total = (input_tokens * input_ratio) + (output_tokens * output_ratio)
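
As a worked example of the formula, the sketch below computes the weighted total for one request. The ratios of 1 and 3 are made-up values for illustration; actual ratios are presumably configured per model.

```javascript
// Direct translation of the weighted_total formula above.
function weightedTotal(inputTokens, outputTokens, inputRatio, outputRatio) {
  return inputTokens * inputRatio + outputTokens * outputRatio;
}

// Illustrative ratios only: 1200 input tokens at 1x plus 300 output tokens at 3x.
const total = weightedTotal(1200, 300, 1, 3); // 1200 + 900 = 2100
```

With these numbers the output tokens account for 900 of the 2100 weighted units even though they are only a fifth of the raw token count.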

File Content Extraction

Files are automatically processed:
  • Text files: Content extracted and appended to prompt
  • Images: Passed to vision-capable models
  • Documents: Text extraction for PDF, DOCX, etc.
  • DeepSeek models: Special handling for file content

Knowledge Base Integration

KB search happens automatically:
  1. Semantic search on user query
  2. Top N chunks retrieved (configurable, default: 10)
  3. Context injected into prompt
  4. LLM generates KB-aware response
Search Parameters:
  • limit: Number of chunks (default: 10)
  • score_threshold: Minimum similarity (default: 0.1)
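
The effect of limit and score_threshold can be illustrated with a small selection function. This is a sketch of the described behavior, not the service's implementation: selectChunks and the chunk shape ({ id, score }) are hypothetical, and treating the threshold as inclusive is an assumption.

```javascript
// Keeps chunks at or above the threshold, ranks by score, and returns the top `limit`.
function selectChunks(chunks, { limit = 10, scoreThreshold = 0.1 } = {}) {
  return chunks
    .filter((chunk) => chunk.score >= scoreThreshold) // filter() copies, so the input is not mutated
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

With the defaults, a chunk scoring 0.05 is dropped while chunks scoring 0.9 and 0.4 are kept in that order; raising score_threshold trades recall for more relevant context in the prompt.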

Auto-Title Generation

For new chats:
  1. Wait for first AI response
  2. Generate title using LLM
  3. Update database
  4. Broadcast via WebSocket
  5. Sync to Firebase

Model Feature Detection

The system automatically detects model capabilities:
  • Text models: Standard chat
  • Image models: Image generation with progress
  • Video models: Video generation with GCP upload
  • Vision models: Image analysis

Agent Integration

When using agents:
  1. Agent must be active
  2. Request sent to agent runtime
  3. Response streamed back
  4. Token usage tracked from Agno session

Rate Limiting

Rate limits are enforced via feature flags:
  • daily_chat_limit: Daily message limit
  • file_upload: File upload quota
  • chat_uploads: Chat-specific upload quota

Next Steps

Ready to start building? Check out our Chat Concepts guide or Getting Started tutorial.