Chat System Overview
Authentication
All endpoints require a valid Bearer token in the Authorization header and appropriate RBAC permissions.
Required Permissions:
- chats:read - Read chat sessions and messages
- chats:write - Create chats and send messages
- chats:delete - Delete chat sessions
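As a minimal sketch of the authentication requirement above, the helper below builds the Authorization header and checks a permission set. The function names and the operation-to-permission mapping are illustrative assumptions, not part of the documented API.

```python
from typing import Dict, Set

# Permission names come from the list above; the operation keys are hypothetical.
REQUIRED_PERMISSION: Dict[str, str] = {
    "read": "chats:read",
    "write": "chats:write",
    "delete": "chats:delete",
}


def build_auth_headers(token: str) -> Dict[str, str]:
    """Return request headers carrying the Bearer token."""
    if not token:
        raise ValueError("a non-empty token is required")
    return {"Authorization": f"Bearer {token}"}


def has_permission(granted: Set[str], operation: str) -> bool:
    """Check whether the granted permissions cover the given operation."""
    return REQUIRED_PERMISSION[operation] in granted
```

A client could call `has_permission` before issuing a request to fail early, although the server remains the authority on access.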
Base URL
Chat Session Management
Create Chat Session
Create a new chat session for organizing conversations.
POST /api/chat
Query Parameters:
| Parameter | Required | Type | Description |
|---|---|---|---|
| org_id | Yes | UUID | Organization ID |
Request Body:
| Field | Type | Required | Description |
|---|---|---|---|
| title | string | No | Chat session title (default: “New Chat”) |
| status | string | No | Session status: ACTIVE, ARCHIVED, DELETED |
| settings | object | No | LLM generation settings |
| settings.temperature | number | No | Sampling temperature (0.0-1.0) |
| settings.max_tokens | number | No | Maximum response tokens |
| settings.top_p | number | No | Nucleus sampling threshold |
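The following sketch shows one way a client might assemble a create-session request from the parameters and fields above. It only builds the request and does not send it. `BASE_URL` is a placeholder and `build_create_chat_request` is a hypothetical helper; neither is defined by this API.

```python
import json
from typing import Any, Dict, Optional
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # assumption: placeholder, not the real base URL


def build_create_chat_request(
    org_id: str,
    title: Optional[str] = None,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    top_p: Optional[float] = None,
) -> Dict[str, Any]:
    """Assemble method, URL, and JSON body for POST /api/chat."""
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be within 0.0-1.0")

    # Only include the settings the caller actually supplied.
    candidates = {"temperature": temperature, "max_tokens": max_tokens, "top_p": top_p}
    settings = {name: value for name, value in candidates.items() if value is not None}

    body: Dict[str, Any] = {}
    if title is not None:
        body["title"] = title
    if settings:
        body["settings"] = settings

    return {
        "method": "POST",
        "url": f"{BASE_URL}/api/chat?{urlencode({'org_id': org_id})}",
        "body": json.dumps(body),
    }
```

Omitting `title` leaves the server free to apply its documented default.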
Get Chat Session
Retrieve a single chat session with all its messages and file uploads.
GET /api/chat
Query Parameters:
| Parameter | Required | Type | Description |
|---|---|---|---|
| chat_id | Yes | UUID | Chat session ID |
| org_id | Yes | UUID | Organization ID |
List Chat Sessions
Retrieve all chat sessions for the authenticated user.
GET /api/chat/list
Query Parameters:
| Parameter | Required | Type | Description |
|---|---|---|---|
| org_id | Yes | UUID | Organization ID |
| status | No | string | Filter by status: ACTIVE, ARCHIVED, DELETED |
Update Chat Session
Update an existing chat session’s title, status, or settings.
PUT /api/chat
Query Parameters:
| Parameter | Required | Type | Description |
|---|---|---|---|
| chat_id | Yes | UUID | Chat session ID |
| org_id | Yes | UUID | Organization ID |
Request Body:
| Field | Type | Required | Description |
|---|---|---|---|
| title | string | No | New chat title |
| status | string | No | New status: ACTIVE, ARCHIVED, DELETED |
| settings | object | No | Updated LLM settings |
Delete Chat Session
Delete a single chat session and all its messages.
DELETE /api/chat/delete_session
Query Parameters:
| Parameter | Required | Type | Description |
|---|---|---|---|
| session_id | Yes | UUID | Chat session ID to delete |
Bulk Delete Chat Sessions
Delete multiple chat sessions in a single request.
POST /api/chat/bulk_delete_sessions
Request Body:
| Field | Type | Required | Description |
|---|---|---|---|
| chat_ids | array | Yes | Array of chat session UUIDs to delete |
Message Operations
Send Message
Send a message in a chat session and receive a streaming AI response. This is the primary endpoint for conversational interactions.
POST /api/chat/send_message
Query Parameters:
| Parameter | Required | Type | Description |
|---|---|---|---|
| org_id | Yes | UUID | Organization ID |
| model_id | Conditional | UUID | LLM model ID (required if agent_id not provided) |
| agent_id | Conditional | UUID | Agent ID (required if model_id not provided) |
| chat_id | No | UUID | Chat session ID (creates new if not provided) |
| instruction_id | No | UUID | Prompt/instruction ID to guide response |
| temperature | No | float | Override temperature (0.0-1.0) |
| max_tokens | No | int | Override max response tokens |
| top_p | No | float | Override nucleus sampling (0.0-1.0) |
Request Body:
| Field | Type | Required | Description |
|---|---|---|---|
| content | string | Yes | Message content/question |
| thinking | boolean | No | Enable reasoning mode (for compatible models) |
| file_uploads | array | No | Array of upload UUIDs to attach |
| knowledge_base_ids | array | No | Array of KB UUIDs to search |
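The conditional rule for `model_id` and `agent_id` is easy to get wrong, so here is a small sketch of a client-side check. It assumes that supplying at least one of the two is sufficient; `build_send_message` is a hypothetical helper name.

```python
from typing import Any, Dict, List, Optional


def build_send_message(
    org_id: str,
    content: str,
    model_id: Optional[str] = None,
    agent_id: Optional[str] = None,
    chat_id: Optional[str] = None,
    knowledge_base_ids: Optional[List[str]] = None,
    file_uploads: Optional[List[str]] = None,
    thinking: bool = False,
) -> Dict[str, Dict[str, Any]]:
    """Return query parameters and JSON body for POST /api/chat/send_message."""
    if model_id is None and agent_id is None:
        raise ValueError("either model_id or agent_id must be provided")
    if not content:
        raise ValueError("content is required")

    raw_params = {
        "org_id": org_id,
        "model_id": model_id,
        "agent_id": agent_id,
        "chat_id": chat_id,  # omitted when None so the server creates a new chat
    }
    params = {name: value for name, value in raw_params.items() if value is not None}

    body: Dict[str, Any] = {"content": content, "thinking": thinking}
    if file_uploads:
        body["file_uploads"] = list(file_uploads)
    if knowledge_base_ids:
        body["knowledge_base_ids"] = list(knowledge_base_ids)

    return {"params": params, "body": body}
```

Validating before sending avoids a round trip that would presumably end in a 400 response.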
Stream Event Types:
| Type | Format | Description |
|---|---|---|
| Token | {"message": "text"} | Regular text token |
| Reasoning | {"type": "reasoning", "message": "..."} | Reasoning step |
| Image | {"media": {...}} | Image generation progress |
| Video | {"media": {...}} | Video generation progress |
| Complete | {"message": "DONE"} | Stream finished |
| Error | {"error": "message"} | Error occurred |
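The event shapes in the table can be told apart by their keys. The sketch below assumes each event arrives as a standard SSE `data:` line carrying one JSON object; the function names are hypothetical and the input is any iterable of text lines.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Tuple


def classify_event(payload: Dict[str, Any]) -> str:
    """Map a decoded payload to one of the event kinds in the table."""
    if "error" in payload:
        return "error"
    if "media" in payload:
        return "media"  # image and video progress share this shape
    if payload.get("type") == "reasoning":
        return "reasoning"
    if payload.get("message") == "DONE":
        return "complete"
    return "token"


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (kind, payload) for every `data:` line, skipping other lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        payload = json.loads(line[len("data:"):].strip())
        yield classify_event(payload), payload


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate regular text tokens until the stream completes."""
    parts: List[str] = []
    for kind, payload in iter_sse_events(lines):
        if kind == "error":
            raise RuntimeError(payload["error"])
        if kind == "complete":
            break
        if kind == "token":
            parts.append(payload["message"])
    return "".join(parts)
```

Note that a text token whose content is exactly "DONE" cannot be distinguished from the completion marker under this scheme, so a real client may need an additional signal.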
Features:
- Auto-Chat Creation: Creates new chat if chat_id not provided
- Auto-Title Generation: Generates meaningful title for new chats
- Knowledge Base Search: Searches specified KBs and enhances prompt
- File Content Extraction: Extracts text from uploaded files
- Billing Integration: Tracks token usage and credits
- WebSocket Broadcasting: Broadcasts title updates
- Firebase Sync: Syncs chat updates to Firebase
- Model Feature Detection: Automatically handles text/image/video generation
- Error Recovery: Gracefully handles LLM provider errors
File Upload Operations
Upload File
Upload a file to be attached to chat messages.
POST /api/chat/upload_file
Query Parameters:
| Parameter | Required | Type | Description |
|---|---|---|---|
| org_id | Yes | UUID | Organization ID |
| chat_id | No | UUID | Chat session ID (for organized storage) |
Response Fields:
| Field | Type | Description |
|---|---|---|
| id | UUID | Upload record ID (use in send_message) |
| url | string | Presigned URL for file access (valid for 7 days) |
Supported File Types:
- Documents: PDF, TXT, DOCX, XLSX, CSV, MD
- Images: JPG, PNG, GIF, WebP, SVG
- Audio: MP3, WAV, M4A, OGG, FLAC
- Video: MP4, WebM, MOV, AVI
Storage:
- Files stored in GCP Cloud Storage
- Organized by: {org_id}/{chat_id}/{filename}
- Presigned URLs expire in 7 days
- Content extraction for compatible formats
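To make the file-type list and storage layout concrete, here is a rough sketch that classifies a filename by extension and builds the storage path. The helper names are hypothetical, and the path for uploads without a `chat_id` is not documented, so the sketch requires one.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional, Set

# Extensions taken from the supported file types listed above.
SUPPORTED_TYPES: Dict[str, Set[str]] = {
    "document": {"pdf", "txt", "docx", "xlsx", "csv", "md"},
    "image": {"jpg", "png", "gif", "webp", "svg"},
    "audio": {"mp3", "wav", "m4a", "ogg", "flac"},
    "video": {"mp4", "webm", "mov", "avi"},
}


def file_category(filename: str) -> Optional[str]:
    """Return the category for a filename, or None if the type is unsupported."""
    extension = PurePosixPath(filename).suffix.lstrip(".").lower()
    for category, extensions in SUPPORTED_TYPES.items():
        if extension in extensions:
            return category
    return None


def storage_key(org_id: str, chat_id: str, filename: str) -> str:
    """Build the object path following the {org_id}/{chat_id}/{filename} layout."""
    return f"{org_id}/{chat_id}/{filename}"
```

Checking the category client-side lets an application reject unsupported files before uploading them.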
Audio Operations
Transcribe Audio
Convert audio files or raw audio data to text.
POST /api/chat/transcribe
Query Parameters:
| Parameter | Required | Type | Description |
|---|---|---|---|
| language | No | string | Language code (default: “en-US”) |
Supported Languages:
- en-US - English (US)
- en-GB - English (UK)
- es-ES - Spanish
- fr-FR - French
- de-DE - German
- it-IT - Italian
- pt-BR - Portuguese (Brazil)
- ja-JP - Japanese
- ko-KR - Korean
- zh-CN - Chinese (Simplified)
Supported Audio Formats:
- MP3
- WAV
- M4A
- OGG
- FLAC
- WebM
Prompt Operations
Generate Prompts
Generate AI-powered prompts based on input text with streaming response.
POST /api/chat/prompt
Request Body:
| Field | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | Input text to generate prompts from |
| num_prompts | number | No | Number of prompts to generate (default: 1) |
| prompt_type | string | No | Type: creative, task, question, continuation (default: task) |
| model | string | No | Model: chat, reason (default: chat) |
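A request body for this endpoint might be assembled as in the sketch below, which applies the documented defaults and rejects values outside the listed options. `build_prompt_body` is a hypothetical helper, and the lower bound of 1 on `num_prompts` is an assumption.

```python
from typing import Any, Dict

PROMPT_TYPES = {"creative", "task", "question", "continuation"}
MODELS = {"chat", "reason"}


def build_prompt_body(
    text: str,
    num_prompts: int = 1,
    prompt_type: str = "task",
    model: str = "chat",
) -> Dict[str, Any]:
    """Validate inputs and return the JSON body for POST /api/chat/prompt."""
    if not text:
        raise ValueError("text is required")
    if num_prompts < 1:  # assumption: at least one prompt must be requested
        raise ValueError("num_prompts must be at least 1")
    if prompt_type not in PROMPT_TYPES:
        raise ValueError(f"unknown prompt_type: {prompt_type}")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    return {
        "text": text,
        "num_prompts": num_prompts,
        "prompt_type": prompt_type,
        "model": model,
    }
```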
Knowledge Base Operations
Get Available Knowledge Bases
Retrieve knowledge bases available for chat integration.
GET /api/chat/available_knowledge_bases
Query Parameters:
| Parameter | Required | Type | Description |
|---|---|---|---|
| org_id | Yes | UUID | Organization ID |
Error Responses
| Status Code | Description | Example |
|---|---|---|
| 400 | Bad Request | Missing required parameter |
| 401 | Unauthorized | Invalid or missing token |
| 402 | Payment Required | Insufficient credits |
| 403 | Forbidden | Insufficient permissions |
| 404 | Not Found | Chat session or resource not found |
| 500 | Internal Server Error | Server-side error |
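One possible way for a client to surface these status codes is to translate them into a single exception type. The class and function below are hypothetical names used for illustration only.

```python
from typing import Dict

# Descriptions copied from the error table above.
STATUS_DESCRIPTIONS: Dict[int, str] = {
    400: "Bad Request",
    401: "Unauthorized",
    402: "Payment Required",
    403: "Forbidden",
    404: "Not Found",
    500: "Internal Server Error",
}


class ChatApiError(Exception):
    """Raised for any response with an error status code."""

    def __init__(self, status: int, description: str, detail: str = "") -> None:
        message = f"{status} {description}"
        if detail:
            message = f"{message}: {detail}"
        super().__init__(message)
        self.status = status
        self.description = description


def raise_for_status(status: int, detail: str = "") -> None:
    """Raise ChatApiError for status codes of 400 and above; otherwise do nothing."""
    if status < 400:
        return
    description = STATUS_DESCRIPTIONS.get(status, "Unexpected Error")
    raise ChatApiError(status, description, detail)
```

Callers can then branch on `error.status`, for example prompting the user to add credits on 402.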
Implementation Notes
Streaming Response Handling
All streaming endpoints use the Server-Sent Events (SSE) format.
Billing Integration
Every chat interaction is billed:
- HOLD created at message start (qty=1)
- Tokens counted during generation
- DEBIT finalized with actual token usage
- Usage metadata updated in chat session
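The billing steps above can be pictured with a toy in-memory ledger. This is only an illustration of the HOLD-then-DEBIT ordering: `BillingLedger` and `bill_interaction` are hypothetical, and treating each streamed item as one token is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class BillingLedger:
    """In-memory stand-in for the billing service (hypothetical)."""

    entries: List[Tuple[str, str, int]] = field(default_factory=list)

    def hold(self, chat_id: str) -> None:
        # A HOLD with qty=1 is placed when the message starts.
        self.entries.append(("HOLD", chat_id, 1))

    def debit(self, chat_id: str, tokens_used: int) -> None:
        # The DEBIT records the actual token usage once generation ends.
        self.entries.append(("DEBIT", chat_id, tokens_used))


def bill_interaction(
    ledger: BillingLedger,
    chat_id: str,
    token_stream: Iterable[str],
    usage: Dict[str, int],
) -> int:
    """Run the four billing steps for one message and return the tokens used."""
    ledger.hold(chat_id)
    tokens_used = sum(1 for _ in token_stream)  # count tokens during generation
    ledger.debit(chat_id, tokens_used)
    usage[chat_id] = usage.get(chat_id, 0) + tokens_used  # update session usage metadata
    return tokens_used
```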
File Content Extraction
Files are automatically processed:
- Text files: Content extracted and appended to prompt
- Images: Passed to vision-capable models
- Documents: Text extraction for PDF, DOCX, etc.
- DeepSeek models: Special handling for file content
Knowledge Base Integration
KB search happens automatically:
- Semantic search on user query
- Top N chunks retrieved (configurable, default: 10)
- Context injected into prompt
- LLM generates KB-aware response
Configurable parameters:
- limit: Number of chunks (default: 10)
- score_threshold: Minimum similarity (default: 0.1)
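How `limit` and `score_threshold` interact can be sketched as a filter followed by a top-N cut. The code assumes the semantic search has already produced (text, score) pairs, that the threshold is inclusive, and that context is injected as plain text ahead of the question; the prompt template and function names are hypothetical.

```python
from typing import List, Tuple

ScoredChunk = Tuple[str, float]


def select_chunks(
    scored_chunks: List[ScoredChunk],
    limit: int = 10,
    score_threshold: float = 0.1,
) -> List[ScoredChunk]:
    """Drop chunks below the threshold, then keep the highest-scoring ones."""
    kept = [pair for pair in scored_chunks if pair[1] >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]


def inject_context(query: str, chunks: List[ScoredChunk]) -> str:
    """Prepend the selected chunk texts to the user query."""
    context = "\n".join(text for text, _ in chunks)
    return f"Context:\n{context}\n\nQuestion: {query}"
```

With the defaults, a chunk scoring 0.05 is discarded even if fewer than 10 chunks remain.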
Auto-Title Generation
For new chats:
- Wait for first AI response
- Generate title using LLM
- Update database
- Broadcast via WebSocket
- Sync to Firebase
Model Feature Detection
The system automatically detects model capabilities:
- Text models: Standard chat
- Image models: Image generation with progress
- Video models: Video generation with GCP upload
- Vision models: Image analysis
Agent Integration
When using agents:
- Agent must be active
- Request sent to agent runtime
- Response streamed back
- Token usage tracked from Agno session
Rate Limiting
Rate limits are enforced via feature flags:
- daily_chat_limit: Daily message limit
- file_upload: File upload quota
- chat_uploads: Chat-specific upload quota
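As an illustration of what a `daily_chat_limit` check could look like, the sketch below counts messages per user per calendar day. The per-user scope, the reset at date change, and the `DailyChatLimiter` class are all assumptions; the actual enforcement mechanism is not described here.

```python
from collections import defaultdict
from datetime import date
from typing import DefaultDict, Optional, Tuple


class DailyChatLimiter:
    """Track how many messages each user has sent on a given day."""

    def __init__(self, daily_chat_limit: int) -> None:
        self.daily_chat_limit = daily_chat_limit
        self._counts: DefaultDict[Tuple[str, date], int] = defaultdict(int)

    def allow(self, user_id: str, today: Optional[date] = None) -> bool:
        """Return True and record the message if the user is under the limit."""
        key = (user_id, today or date.today())
        if self._counts[key] >= self.daily_chat_limit:
            return False
        self._counts[key] += 1
        return True
```

Passing `today` explicitly keeps the check deterministic in tests.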
Next Steps
Explore related APIs and features:
- Knowledge Base Service - Manage knowledge bases
- Agents Service - Create and deploy agents
- LLM Service - Manage LLM models
- WebSocket Service - Real-time updates