Chat Service API

The Chat Service provides endpoints for creating and managing conversational AI interactions. It supports multi-turn dialogues with LLMs and AI agents, file uploads, knowledge base integration, real-time streaming, and audio transcription.

Chat System Overview

Authentication

All endpoints require a valid Bearer token in the Authorization header and appropriate RBAC permissions.

Required Permissions:

chats:read - Read chat sessions and messages
chats:write - Create chats and send messages
chats:delete - Delete chat sessions

Base URL

/api/chat

Chat Session Management

Create Chat Session

Create a new chat session for organizing conversations.

curl -X POST "{{baseUrl}}/api/chat?org_id=your-org-id" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Product Design Discussion",
    "status": "ACTIVE",
    "settings": {
      "temperature": 0.7,
      "max_tokens": 2000,
      "top_p": 0.9
    }
  }'

Endpoint: POST /api/chat

Query Parameters:

Parameter	Required	Type	Description
`org_id`	Yes	UUID	Organization ID

Request Body:

Field	Type	Required	Description
`title`	string	No	Chat session title (default: “New Chat”)
`status`	string	No	Session status: ACTIVE, ARCHIVED, DELETED
`settings`	object	No	LLM generation settings
`settings.temperature`	number	No	Sampling temperature (0.0-1.0)
`settings.max_tokens`	number	No	Maximum response tokens
`settings.top_p`	number	No	Nucleus sampling threshold
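The same request can be assembled in JavaScript. The following is a minimal sketch, not an official client: the helper name `buildCreateChatRequest` and its argument layout are assumptions made for illustration, and the only validation it performs is the documented 0.0-1.0 temperature range.

```javascript
// Hypothetical helper: prepares the pieces of a POST /api/chat request.
// It does not send anything; pass the result to fetch or another HTTP client.
function buildCreateChatRequest(baseUrl, token, orgId, session = {}) {
  const { title, status, settings } = session;

  // The request body table gives 0.0-1.0 as the temperature range.
  if (settings && settings.temperature !== undefined) {
    const t = settings.temperature;
    if (typeof t !== 'number' || t < 0 || t > 1) {
      throw new RangeError('settings.temperature must be between 0.0 and 1.0');
    }
  }

  const query = new URLSearchParams({ org_id: orgId });
  return {
    url: `${baseUrl}/api/chat?${query}`,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      // JSON.stringify drops undefined fields, so omitted values are not sent.
      body: JSON.stringify({ title, status, settings }),
    },
  };
}
```

A caller would then run `fetch(request.url, request.options)`. Building the query with `URLSearchParams` keeps the organization ID correctly URL-encoded.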

Get Chat Session

Retrieve a single chat session with all its messages and file uploads.

curl -X GET "{{baseUrl}}/api/chat?chat_id=a1b2c3d4-e5f6-4789-a0b1-c2d3e4f5a6b7&org_id=your-org-id" \
  -H "Authorization: Bearer YOUR_TOKEN"

Endpoint: GET /api/chat

Query Parameters:

Parameter	Required	Type	Description
`chat_id`	Yes	UUID	Chat session ID
`org_id`	Yes	UUID	Organization ID

List Chat Sessions

Retrieve all chat sessions for the authenticated user.

curl -X GET "{{baseUrl}}/api/chat/list?org_id=your-org-id&status=ACTIVE" \
  -H "Authorization: Bearer YOUR_TOKEN"

Endpoint: GET /api/chat/list

Query Parameters:

Parameter	Required	Type	Description
`org_id`	Yes	UUID	Organization ID
`status`	No	string	Filter by status: ACTIVE, ARCHIVED, DELETED
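As a small illustration of the optional `status` filter, the sketch below builds the list URL and rejects values outside the three documented statuses before any request is made. The helper name `buildListChatsUrl` is an assumption, and rejecting unknown values on the client is a design choice of this sketch rather than documented server behavior.

```javascript
// Status values listed in the query parameter table.
const CHAT_STATUSES = ['ACTIVE', 'ARCHIVED', 'DELETED'];

// Hypothetical helper: builds the GET /api/chat/list URL.
function buildListChatsUrl(baseUrl, orgId, status) {
  const query = new URLSearchParams({ org_id: orgId });
  if (status !== undefined) {
    if (!CHAT_STATUSES.includes(status)) {
      throw new Error(`Unknown status "${status}"`);
    }
    query.set('status', status);
  }
  return `${baseUrl}/api/chat/list?${query}`;
}
```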

Update Chat Session

Update an existing chat session’s title, status, or settings.

curl -X PUT "{{baseUrl}}/api/chat?chat_id=a1b2c3d4-e5f6-4789-a0b1-c2d3e4f5a6b7&org_id=your-org-id" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Updated Product Design Discussion",
    "status": "ACTIVE",
    "settings": {
      "temperature": 0.8,
      "max_tokens": 3000
    }
  }'

Endpoint: PUT /api/chat

Query Parameters:

Parameter	Required	Type	Description
`chat_id`	Yes	UUID	Chat session ID
`org_id`	Yes	UUID	Organization ID

Request Body:

Field	Type	Required	Description
`title`	string	No	New chat title
`status`	string	No	New status: ACTIVE, ARCHIVED, DELETED
`settings`	object	No	Updated LLM settings

Delete Chat Session

Delete a single chat session and all its messages.

curl -X DELETE "{{baseUrl}}/api/chat/delete_session?session_id=a1b2c3d4-e5f6-4789-a0b1-c2d3e4f5a6b7" \
  -H "Authorization: Bearer YOUR_TOKEN"

Endpoint: DELETE /api/chat/delete_session

Query Parameters:

Parameter	Required	Type	Description
`session_id`	Yes	UUID	Chat session ID to delete

Bulk Delete Chat Sessions

Delete multiple chat sessions in a single request.

curl -X POST "{{baseUrl}}/api/chat/bulk_delete_sessions" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "chat_ids": [
      "a1b2c3d4-e5f6-4789-a0b1-c2d3e4f5a6b7",
      "b2c3d4e5-f6a7-4890-b1c2-d3e4f5a6b7c8",
      "c3d4e5f6-a7b8-4901-c2d3-e4f5a6b7c8d9"
    ]
  }'

Endpoint: POST /api/chat/bulk_delete_sessions

Request Body:

Field	Type	Required	Description
`chat_ids`	array	Yes	Array of chat session UUIDs to delete

Message Operations

Send Message

Send a message in a chat session and receive a streaming AI response. This is the primary endpoint for conversational interactions.

curl -X POST "{{baseUrl}}/api/chat/send_message?org_id=your-org-id&model_id=model-uuid&chat_id=chat-uuid" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "What are the best practices for API design?",
    "thinking": false,
    "file_uploads": ["upload-uuid-1", "upload-uuid-2"],
    "knowledge_base_ids": ["kb-uuid-1", "kb-uuid-2"]
  }'

Endpoint: POST /api/chat/send_message

Query Parameters:

Parameter	Required	Type	Description
`org_id`	Yes	UUID	Organization ID
`model_id`	Conditional	UUID	LLM model ID (required if agent_id not provided)
`agent_id`	Conditional	UUID	Agent ID (required if model_id not provided)
`chat_id`	No	UUID	Chat session ID (creates new if not provided)
`instruction_id`	No	UUID	Prompt/instruction ID to guide response
`temperature`	No	float	Override temperature (0.0-1.0)
`max_tokens`	No	int	Override max response tokens
`top_p`	No	float	Override nucleus sampling (0.0-1.0)

Request Body:

Field	Type	Required	Description
`content`	string	Yes	Message content/question
`thinking`	boolean	No	Enable reasoning mode (for compatible models)
`file_uploads`	array	No	Array of upload UUIDs to attach
`knowledge_base_ids`	array	No	Array of KB UUIDs to search
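The conditional `model_id`/`agent_id` rule is easy to miss, so a client may want to check it before sending. The sketch below is one way to do that under stated assumptions: `buildSendMessageRequest` is a hypothetical name, and the client-side checks only mirror the tables above. The server remains the authority on what is valid.

```javascript
// Hypothetical helper: builds the URL and body for POST /api/chat/send_message.
function buildSendMessageRequest(baseUrl, params, message) {
  const { org_id, model_id, agent_id } = params;
  if (!org_id) throw new Error('org_id is required');
  // Mirrors the documented rule: at least one of model_id or agent_id.
  if (!model_id && !agent_id) {
    throw new Error('Either model_id or agent_id must be provided');
  }
  if (typeof message.content !== 'string' || message.content === '') {
    throw new Error('content is required');
  }

  // Copy only the query parameters that were actually supplied.
  const query = new URLSearchParams();
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined && value !== null) query.set(key, String(value));
  }

  return {
    url: `${baseUrl}/api/chat/send_message?${query}`,
    body: JSON.stringify(message),
  };
}
```

Leaving `chat_id` out of `params` produces a URL without it, which per the table causes a new chat session to be created.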

Response Format: Server-Sent Events (SSE) stream with the following data formats:

Type	Format	Description
Token	`{"message": "text"}`	Regular text token
Reasoning	`{"type": "reasoning", "message": "..."}`	Reasoning step
Image	`{"media": {...}}`	Image generation progress
Video	`{"media": {...}}`	Video generation progress
Complete	`{"message": "DONE"}`	Stream finished
Error	`{"error": "message"}`	Error occurred
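To show how the event shapes in this table might be told apart on the client, here is a short sketch. The function names and the `kind` labels are inventions of this example; only the payload fields (`message`, `type`, `media`, `error`) come from the table.

```javascript
// Hypothetical classifier for one parsed SSE payload from send_message.
function classifyStreamEvent(payload) {
  if (payload.error !== undefined) return { kind: 'error', value: payload.error };
  if (payload.media !== undefined) return { kind: 'media', value: payload.media };
  if (payload.type === 'reasoning') return { kind: 'reasoning', value: payload.message };
  if (payload.message === 'DONE') return { kind: 'done', value: null };
  return { kind: 'token', value: payload.message };
}

// Joins the regular text tokens from a list of payloads, stopping at DONE.
function collectAnswer(payloads) {
  let text = '';
  for (const payload of payloads) {
    const event = classifyStreamEvent(payload);
    if (event.kind === 'error') throw new Error(event.value);
    if (event.kind === 'done') break;
    if (event.kind === 'token') text += event.value;
  }
  return text;
}
```

Because the completion marker shares the `message` field with ordinary tokens, this sketch cannot distinguish a model that emits the literal text `DONE` as a single token from the end of the stream.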

Features:

Auto-Chat Creation: Creates new chat if chat_id not provided
Auto-Title Generation: Generates meaningful title for new chats
Knowledge Base Search: Searches specified KBs and enhances prompt
File Content Extraction: Extracts text from uploaded files
Billing Integration: Tracks token usage and credits
WebSocket Broadcasting: Broadcasts title updates
Firebase Sync: Syncs chat updates to Firebase
Model Feature Detection: Automatically handles text/image/video generation
Error Recovery: Gracefully handles LLM provider errors

File Upload Operations

Upload File

Upload a file to be attached to chat messages.

curl -X POST "{{baseUrl}}/api/chat/upload_file?org_id=your-org-id&chat_id=chat-uuid" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file=@/path/to/document.pdf"

Endpoint: POST /api/chat/upload_file

Query Parameters:

Parameter	Required	Type	Description
`org_id`	Yes	UUID	Organization ID
`chat_id`	No	UUID	Chat session ID (for organized storage)

Request Body: Multipart form data with file field.

Response:

Field	Type	Description
`id`	UUID	Upload record ID (use in send_message)
`url`	string	Presigned URL for file access (valid for 7 days)

Supported File Types:

Documents: PDF, TXT, DOCX, XLSX, CSV, MD
Images: JPG, PNG, GIF, WebP, SVG
Audio: MP3, WAV, M4A, OGG, FLAC
Video: MP4, WebM, MOV, AVI
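A client can check a file name against this list before uploading. The sketch below assumes the check is done by file extension, which is a simplification chosen for the example; how the service itself identifies file types is not stated here. Only the extensions listed above are included, so variants such as `.jpeg` return `null`.

```javascript
// Extensions taken from the supported-types list, grouped by category.
const SUPPORTED_EXTENSIONS = {
  document: ['pdf', 'txt', 'docx', 'xlsx', 'csv', 'md'],
  image: ['jpg', 'png', 'gif', 'webp', 'svg'],
  audio: ['mp3', 'wav', 'm4a', 'ogg', 'flac'],
  video: ['mp4', 'webm', 'mov', 'avi'],
};

// Returns the category for a file name, or null when the extension is not listed.
function fileCategory(fileName) {
  const dot = fileName.lastIndexOf('.');
  if (dot === -1 || dot === fileName.length - 1) return null;
  const ext = fileName.slice(dot + 1).toLowerCase();
  for (const [category, extensions] of Object.entries(SUPPORTED_EXTENSIONS)) {
    if (extensions.includes(ext)) return category;
  }
  return null;
}
```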

Storage:

Files stored in GCP Cloud Storage
Organized by: {org_id}/{chat_id}/{filename}
Presigned URLs expire in 7 days
Content extraction for compatible formats

Audio Operations

Transcribe Audio

Convert audio files or raw audio data to text.

curl -X POST "{{baseUrl}}/api/chat/transcribe?language=en-US" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file=@/path/to/audio.mp3"

Endpoint: POST /api/chat/transcribe

Query Parameters:

Parameter	Required	Type	Description
`language`	No	string	Language code (default: “en-US”)

Request Body: Multipart form data with file field, OR raw bytes with content_type parameter.

Supported Languages:

en-US - English (US)
en-GB - English (UK)
es-ES - Spanish
fr-FR - French
de-DE - German
it-IT - Italian
pt-BR - Portuguese (Brazil)
ja-JP - Japanese
ko-KR - Korean
zh-CN - Chinese (Simplified)

Supported Audio Formats:

MP3
WAV
M4A
OGG
FLAC
WebM

Prompt Operations

Generate Prompts

Generate AI-powered prompts based on input text with streaming response.

curl -X POST "{{baseUrl}}/api/chat/prompt" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "artificial intelligence in healthcare",
    "num_prompts": 3,
    "prompt_type": "creative",
    "model": "chat"
  }'

Endpoint: POST /api/chat/prompt

Request Body:

Field	Type	Required	Description
`text`	string	Yes	Input text to generate prompts from
`num_prompts`	number	No	Number of prompts to generate (default: 1)
`prompt_type`	string	No	Type: creative, task, question, continuation (default: task)
`model`	string	No	Model: chat, reason (default: chat)

Response: Server-Sent Events stream with format:

data: {"content": "token"}
data: {"content": "DONE"}
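The stream format above can be parsed with a few lines of JavaScript. This is a sketch that assumes each event arrives as a single `data: {...}` line, as in the sample, and that the whole response text is already available. The name `parsePromptStream` is hypothetical.

```javascript
// Parses raw SSE text from /api/chat/prompt and joins the content tokens.
function parsePromptStream(rawText) {
  let result = '';
  for (const line of rawText.split(/\r?\n/)) {
    if (!line.startsWith('data:')) continue; // skip blank separator lines
    const payload = JSON.parse(line.slice('data:'.length).trim());
    if (payload.content === 'DONE') break; // end-of-stream marker
    result += payload.content;
  }
  return result;
}
```

For incremental display, the same per-line logic can run on each chunk as it arrives, with any incomplete trailing line held back until the next chunk.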

Knowledge Base Operations

Get Available Knowledge Bases

Retrieve knowledge bases available for chat integration.

curl -X GET "{{baseUrl}}/api/chat/available_knowledge_bases?org_id=your-org-id" \
  -H "Authorization: Bearer YOUR_TOKEN"

Endpoint: GET /api/chat/available_knowledge_bases

Query Parameters:

Parameter	Required	Type	Description
`org_id`	Yes	UUID	Organization ID

Error Responses

Status Code	Description	Example
400	Bad Request	Missing required parameter
401	Unauthorized	Invalid or missing token
402	Payment Required	Insufficient credits
403	Forbidden	Insufficient permissions
404	Not Found	Chat session or resource not found
500	Internal Server Error	Server-side error

Error Response Format:

{
  "detail": "Error message describing what went wrong"
}

Common Error Messages:

// Insufficient credits
{
  "detail": "Insufficient credits to use this model."
}

// Model/Agent required
{
  "detail": "Either model_id or agent_id must be provided"
}

// Chat not found
{
  "detail": "Chat session not found"
}

// LLM provider error
{
  "detail": "LLM provider openai is not configured correctly. Please contact support."
}
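One way a client might turn these responses into next steps is sketched below. The status codes and the `detail` field come from the tables above. The `action` labels and the function name `describeError` are illustrative assumptions, not part of the API.

```javascript
// Maps a status code and parsed error body to a suggested client action.
function describeError(status, body) {
  const detail =
    body && typeof body.detail === 'string' ? body.detail : 'Unknown error';
  const actions = {
    400: 'fix-request',
    401: 'reauthenticate',
    402: 'add-credits',
    403: 'check-permissions',
    404: 'check-ids',
    500: 'retry-later',
  };
  return { status, detail, action: actions[status] || 'unhandled' };
}
```

For example, `describeError(402, { detail: 'Insufficient credits to use this model.' })` yields an object whose `action` is `'add-credits'`, which a UI could use to show a top-up prompt instead of a generic failure.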

Implementation Notes

Streaming Response Handling

All streaming endpoints use Server-Sent Events (SSE) format:

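// Browser example. EventSource only issues GET requests and cannot set an
// Authorization header, so read the POST response stream with fetch instead.
async function streamMessage(body) {
  const response = await fetch('/api/chat/send_message?...', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_TOKEN',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(body),
  });
  if (!response.ok) throw new Error(`HTTP ${response.status}`);

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop(); // keep a partial line for the next chunk

    for (const line of lines) {
      if (!line.startsWith('data:')) continue;
      const data = JSON.parse(line.slice(5).trim());
      if (data.error) throw new Error(data.error);
      if (data.message === 'DONE') return;
      console.log(data.message);
    }
  }
}

streamMessage({ content: 'Hello' })
  .catch((error) => console.error('Stream error:', error));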

Billing Integration

Every chat interaction is billed:

HOLD created at message start (qty=1)
Tokens counted during generation
DEBIT finalized with actual token usage
Usage metadata updated in chat session

Token Calculation:

weighted_total = (input_tokens * input_ratio) + (output_tokens * output_ratio)
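As a worked example of this formula, the sketch below transcribes it directly. The ratio values used in the example call are made up for illustration; actual ratios are not given in this document.

```javascript
// Direct transcription of the weighted-total formula.
function weightedTotal(inputTokens, outputTokens, inputRatio, outputRatio) {
  return inputTokens * inputRatio + outputTokens * outputRatio;
}

// Made-up ratios: 1,000 input tokens at 1.0 and 500 output tokens at 3.0.
const total = weightedTotal(1000, 500, 1.0, 3.0); // 1000 + 1500 = 2500
```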

File Content Extraction

Files are automatically processed:

Text files: Content extracted and appended to prompt
Images: Passed to vision-capable models
Documents: Text extraction for PDF, DOCX, etc.
DeepSeek models: Special handling for file content

Knowledge Base Integration

KB search happens automatically:

Semantic search on user query
Top N chunks retrieved (configurable, default: 10)
Context injected into prompt
LLM generates KB-aware response

Search Parameters:

limit: Number of chunks (default: 10)
score_threshold: Minimum similarity (default: 0.1)
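The retrieval steps and the two search parameters can be pictured with a small sketch. This illustrates the general pattern only, not the service's implementation. The hit shape (`text`, `score`), the inclusive threshold comparison, and the prompt layout are all assumptions made for the example.

```javascript
// Keep hits at or above the threshold, best first, capped at `limit`.
function selectChunks(hits, { limit = 10, scoreThreshold = 0.1 } = {}) {
  return hits
    .filter((hit) => hit.score >= scoreThreshold) // filter returns a new array
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Builds a prompt with the selected chunks placed before the question.
function injectContext(question, chunks) {
  const context = chunks
    .map((chunk, i) => `[${i + 1}] ${chunk.text}`)
    .join('\n');
  return `Context:\n${context}\n\nQuestion: ${question}`;
}
```

With the defaults, a hit scoring 0.05 is dropped, and at most ten of the remaining hits reach the prompt.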

Auto-Title Generation

For new chats:

Wait for first AI response
Generate title using LLM
Update database
Broadcast via WebSocket
Sync to Firebase

Model Feature Detection

The system automatically detects model capabilities:

Text models: Standard chat
Image models: Image generation with progress
Video models: Video generation with GCP upload
Vision models: Image analysis

Agent Integration

When using agents:

Agent must be active
Request sent to agent runtime
Response streamed back
Token usage tracked from Agno session

Rate Limiting

Rate limits are enforced via feature flags:

daily_chat_limit: Daily message limit
file_upload: File upload quota
chat_uploads: Chat-specific upload quota

Next Steps

Explore related APIs and features:

Knowledge Base Service - Manage knowledge bases
Agents Service - Create and deploy agents
LLM Service - Manage LLM models
WebSocket Service - Real-time updates

Ready to start building? Check out our Chat Concepts guide or Getting Started tutorial.
