What is a Chat?
A Chat (also called a Chat Session) is a conversational thread that:

- Maintains Context: Keeps track of the entire conversation history
- Supports Multi-turn Dialogue: Allows back-and-forth exchanges between users and AI
- Integrates Knowledge: Can leverage knowledge bases for enhanced responses
- Handles Media: Supports file uploads including documents, images, audio, and video
- Streams Responses: Provides real-time token-by-token streaming for immediate feedback
- Tracks Usage: Monitors token consumption and billing information
Chat Architecture
Core Components
1. Chat Sessions
A chat session is the container for an entire conversation.

Key Features:
- Unique Identifier: Each session has a UUID for tracking
- Title Management: Auto-generates meaningful titles from conversation content
- Status Tracking: Active, Archived, or Deleted states
- Metadata Storage: Flexible JSON storage for custom data
- Settings Persistence: Saves user preferences for temperature, max tokens, etc.
2. Messages
Messages are the individual exchanges within a chat session.

Message Types:
- USER: Messages sent by the human user
- MODEL: Responses generated by LLM models
- AGENT: Responses from AI agents
Each message stores:
- Content (text/media)
- Role (USER/MODEL/AGENT)
- Parent message ID (for threading)
- Model or Agent ID
- Prompt/Instruction ID
- Metadata (knowledge base IDs, file references)
- Timestamps
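The message fields above can be sketched as a simple record. This is a hypothetical illustration of the shape, not the actual schema; field names here are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional
import time
import uuid

# Hypothetical message record mirroring the fields listed above;
# the real schema and field names may differ.
@dataclass
class Message:
    content: str
    role: str                                     # "USER", "MODEL", or "AGENT"
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    parent_message_id: Optional[str] = None       # links replies for threading
    model_or_agent_id: Optional[str] = None
    prompt_id: Optional[str] = None
    metadata: dict = field(default_factory=dict)  # KB IDs, file references
    created_at: float = field(default_factory=time.time)

question = Message(content="Summarize this PDF", role="USER")
reply = Message(content="The document covers...", role="MODEL",
                parent_message_id=question.id)
```

Threading works by pointing each reply's `parent_message_id` at the message it answers, so a client can reconstruct the conversation tree.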
3. File Uploads
The chat system supports rich media attachments.

Supported File Types:
- Documents: PDF, TXT, DOCX, XLSX, CSV
- Images: JPG, PNG, GIF, WebP
- Audio: MP3, WAV, M4A, OGG
- Video: MP4, WebM, MOV
File Handling:
- Files are uploaded to GCP Storage
- Presigned URLs generated for secure access
- Content extraction for text-based files
- Image/video processing for multimodal models
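Before uploading, a client would typically validate the file against the supported types. A minimal sketch, assuming the extension lists above and a 50 MB limit (the actual size limit may differ):

```python
# Supported extensions taken from the list above; MAX_BYTES is an
# assumed limit for illustration, not the documented one.
SUPPORTED = {
    "document": {".pdf", ".txt", ".docx", ".xlsx", ".csv"},
    "image":    {".jpg", ".png", ".gif", ".webp"},
    "audio":    {".mp3", ".wav", ".m4a", ".ogg"},
    "video":    {".mp4", ".webm", ".mov"},
}
MAX_BYTES = 50 * 1024 * 1024

def classify_upload(filename: str, size_bytes: int) -> str:
    ext = "." + filename.rsplit(".", 1)[-1].lower()
    if size_bytes > MAX_BYTES:
        raise ValueError("file too large")
    for kind, extensions in SUPPORTED.items():
        if ext in extensions:
            return kind                 # e.g. "document", "image", ...
    raise ValueError(f"unsupported file type: {ext}")
```

Validating on the client avoids a round trip to storage for files that would be rejected anyway.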
4. Chat Settings
Per-session LLM parameters that override defaults.

Available Settings:
- temperature: Controls randomness (0.0 - 1.0)
- max_tokens: Maximum response length
- top_p: Nucleus sampling threshold

Settings Priority:
- Request-level parameters (highest priority)
- Saved session settings
- Model defaults (lowest priority)
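The precedence above can be sketched as a simple merge, where later updates win. This is a minimal illustration of the priority order, not the actual resolution code:

```python
# Merge settings by priority: model defaults < session settings < request.
def resolve_settings(request: dict, session: dict, model_defaults: dict) -> dict:
    merged = dict(model_defaults)                                  # lowest
    merged.update({k: v for k, v in session.items() if v is not None})
    merged.update({k: v for k, v in request.items() if v is not None})
    return merged                                                  # highest wins

settings = resolve_settings(
    request={"temperature": 0.2},
    session={"temperature": 0.7, "max_tokens": 1024},
    model_defaults={"temperature": 0.5, "max_tokens": 2048, "top_p": 1.0},
)
```

Here the request's `temperature` overrides both the saved session value and the model default, while `max_tokens` falls through to the session setting and `top_p` to the model default.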
Chat Features
Real-time Streaming
Responses stream token-by-token for immediate user feedback.

Stream Format:
- Text Tokens: Regular chat responses
- Reasoning Steps: For models with reasoning capabilities
- Image Generation: Progress updates and final URLs
- Video Generation: Progress updates and final URLs
- Error Messages: Graceful error handling
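A client consuming this stream dispatches on the event type. The sketch below uses illustrative event names ("token", "reasoning", "error"), not the actual wire format:

```python
# Consume a typed event stream and assemble the text response.
def consume(stream) -> str:
    parts = []
    for event in stream:
        kind = event["type"]
        if kind == "token":
            parts.append(event["value"])         # append text as it arrives
        elif kind == "error":
            raise RuntimeError(event["value"])   # surface errors gracefully
        # reasoning steps and image/video progress would be handled here too
    return "".join(parts)

fake_stream = [
    {"type": "token", "value": "Hel"},
    {"type": "reasoning", "value": "..."},
    {"type": "token", "value": "lo"},
]
```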
Knowledge Base Integration
Chats can leverage knowledge bases for enhanced responses.

How It Works:
- User specifies knowledge base IDs in the message
- System performs semantic search on user query
- Top relevant chunks retrieved (configurable limit)
- Context injected into the prompt
- LLM generates response using both its knowledge and KB context
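The context-injection step can be sketched as below. The prompt template is an assumption for illustration; the real template is internal to the system:

```python
# Inject the top-k retrieved chunks as context ahead of the user query.
def build_prompt(query: str, chunks: list, limit: int = 3) -> str:
    context = "\n---\n".join(chunks[:limit])     # configurable retrieval limit
    return f"Context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt(
    "What is the refund policy?",
    ["Refunds are accepted within 30 days...", "Shipping takes 3-5 days..."],
)
```

The LLM then answers using both the injected chunks and its own knowledge, which is why limiting retrieval to the most relevant chunks reduces noise.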
Audio Transcription
Convert speech to text for voice-based interactions.

Transcription Features:
- Multiple Formats: MP3, WAV, M4A, OGG
- Language Support: Multi-language transcription
- Streaming Support: Process audio files of any size
- High Accuracy: Powered by advanced speech recognition
Prompt Generation
AI-powered prompt enhancement and generation.

Prompt Types:
- Creative: Generate creative writing prompts
- Task: Create task-oriented prompts
- Question: Generate insightful questions
- Continuation: Suggest conversation continuations
Multi-Modal Support
Advanced models can handle various content types.

Capabilities:
- Text Generation: Standard chat responses
- Image Generation: Create images from text descriptions
- Video Generation: Generate short video clips
- Vision: Analyze uploaded images
- Document Understanding: Extract and analyze document content
Billing and Usage Tracking
Credit System
Every chat interaction is tracked and billed.

Token Calculation:
- Input tokens
- Output tokens
- Total tokens (weighted)
- Cached tokens (if applicable)
- Pricing ratios
- Message IDs
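A hypothetical credit formula built from the factors above; the actual pricing ratios are set per model, and the numbers below are placeholders:

```python
# Weighted token cost: output tokens typically cost more than input,
# and cached input tokens cost less. Ratios here are assumptions.
def chat_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0,
              input_ratio: float = 1.0, output_ratio: float = 3.0,
              cached_ratio: float = 0.25) -> float:
    fresh_input = (input_tokens - cached_tokens) * input_ratio
    cached = cached_tokens * cached_ratio        # cached tokens billed cheaper
    return fresh_input + cached + output_tokens * output_ratio
```

For example, 1000 input tokens of which 400 were cached, plus 200 output tokens, would bill 600 + 100 + 600 = 1300 credits under these placeholder ratios.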
Usage Metadata
Chat sessions track cumulative usage.
Chat Operations
Creating a Chat Session
Auto-Created Sessions:
- If no chat_id provided in send_message
- System creates session automatically
- Title set to "New Chat"
- Status set to ACTIVE
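The auto-creation rule above can be sketched as follows, using a plain dict as stand-in storage (the real persistence layer is assumed away here):

```python
import uuid

# If no chat_id is supplied (or it is unknown), create a fresh session
# titled "New Chat" in ACTIVE status; otherwise return the existing one.
def ensure_session(chat_id, sessions: dict) -> dict:
    if chat_id is not None and chat_id in sessions:
        return sessions[chat_id]
    new_id = str(uuid.uuid4())
    sessions[new_id] = {"id": new_id, "title": "New Chat", "status": "ACTIVE"}
    return sessions[new_id]

store = {}
session = ensure_session(None, store)   # no chat_id -> session auto-created
```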
Updating Chat Sessions
Updatable Fields:
- Title (manual or auto-generated)
- Status (ACTIVE, ARCHIVED, DELETED)
- Settings (temperature, max_tokens, top_p)
Deleting Chat Sessions
Single Delete:
- Soft delete (status changes to DELETED)
- Cascade deletes messages
- Removes file upload links
Batch Delete:
- Delete multiple sessions in one request
- Validates ownership
- Returns count of deleted sessions
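A minimal sketch of batch soft deletion with ownership validation, assuming each session record carries an `owner_id` field (an assumption for illustration):

```python
# Soft-delete the given session IDs owned by owner_id; skip sessions
# that are missing, already deleted, or owned by someone else.
def soft_delete(sessions: dict, owner_id: str, ids: list) -> int:
    deleted = 0
    for sid in ids:
        s = sessions.get(sid)
        if s and s["owner_id"] == owner_id and s["status"] != "DELETED":
            s["status"] = "DELETED"     # status change, not physical removal
            deleted += 1
    return deleted

store = {
    "a": {"owner_id": "u1", "status": "ACTIVE"},
    "b": {"owner_id": "u2", "status": "ACTIVE"},
}
count = soft_delete(store, "u1", ["a", "b", "missing"])
```

Only the caller's own sessions are affected, and the returned count reflects sessions actually transitioned to DELETED.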
Advanced Features
Agent Integration
Chats can interact with deployed AI agents.

Agent Requirements:
- Agent must be active
- Agent must be deployed
- User must have access to agent
Instruction/Prompt System
Use pre-defined prompts to guide AI behavior.

Prompt Components:
- ID: Unique identifier
- Title: Display name
- Description: Purpose description
- Content: Actual prompt text
WebSocket Updates
Real-time updates for chat state changes.

Events Broadcasted:
- Chat title updates (after auto-generation)
- New message notifications
- Status changes
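A client might route these broadcasts through a small handler registry. The event names below are assumptions for illustration, not the actual protocol identifiers:

```python
# Tiny dispatcher: register a handler per event type, then route
# incoming events to it.
def make_dispatcher():
    handlers = {}
    def on(event_type, fn):
        handlers[event_type] = fn
    def dispatch(event):
        handler = handlers.get(event["type"])
        return handler(event["payload"]) if handler else None
    return on, dispatch

on, dispatch = make_dispatcher()
titles = []
on("chat.title_updated", titles.append)     # react to auto-generated titles
dispatch({"type": "chat.title_updated", "payload": "Trip Planning"})
```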
Firebase Integration
Chat updates sync to Firebase Realtime Database.
Best Practices
Chat Management
- Session Organization: Use meaningful titles and archive old chats
- Settings Optimization: Adjust temperature and max_tokens per use case
- Knowledge Base Selection: Only include relevant KBs to reduce noise
- File Management: Clean up unused uploads regularly
Performance Optimization
- Streaming: Always use streaming for better UX
- Context Window: Monitor message count and summarize long conversations
- Token Limits: Set appropriate max_tokens to control costs
- Caching: Leverage cached tokens when available
Error Handling
- Billing Failures: Handle insufficient credits gracefully
- LLM Errors: Display user-friendly error messages
- File Upload Limits: Validate file size and type before upload
- Timeout Handling: Set appropriate timeouts for long operations
Security Considerations
- Access Control: Use RBAC to control chat access
- File Validation: Validate file types and scan for malware
- Content Filtering: Implement content moderation as needed
- Data Privacy: Handle sensitive conversations appropriately
Common Use Cases
Customer Support Bot
Configuration:
- Temperature: 0.3 (more deterministic)
- Knowledge Bases: FAQ, Product Docs, Support Articles
- Prompts: Customer service instructions
Code Assistant
Features:
- File upload for code review
- Multi-turn debugging conversations
- Code generation with context
- Documentation lookup via KB
Configuration:
- Temperature: 0.7 (balanced creativity)
- Models: GPT-4, Claude-3.5-Sonnet
- File types: .py, .js, .java, etc.
Research Assistant
Features:
- Document upload and analysis
- Knowledge base integration
- Citation tracking
- Long-form responses
Configuration:
- Temperature: 0.5 (factual)
- Max tokens: 4096 (long responses)
- Knowledge Bases: Research papers, articles
Creative Writing
Features:
- Prompt generation
- Story continuation
- Character development
- Style adaptation
Configuration:
- Temperature: 0.9 (highly creative)
- Prompt type: "creative"
- Models: GPT-4, Claude-3-Opus
Performance Metrics
Quality Metrics
- Response Relevance: Accuracy of AI responses
- Context Retention: How well context is maintained
- Knowledge Integration: Effectiveness of KB usage
- Error Rate: Frequency of failed interactions
Efficiency Metrics
- Response Time: Time to first token
- Streaming Speed: Tokens per second
- Token Usage: Input/output token counts
- Cost per Chat: Average credits consumed
Usage Metrics
- Active Sessions: Number of ongoing chats
- Message Volume: Total messages sent
- File Uploads: Number and size of uploads
- Knowledge Base Hits: KB search frequency
Troubleshooting
Common Issues
Slow Responses:
- Check LLM provider status
- Reduce max_tokens
- Optimize knowledge base searches
- Monitor network latency
Billing Issues:
- Verify sufficient credits
- Check transaction logs
- Review usage metadata
- Contact support for discrepancies
Context Loss:
- Keep conversations under the context limit
- Implement conversation summarization
- Use parent_message_id correctly
- Check message ordering
File Upload Failures:
- Validate file size limits
- Check GCP storage configuration
- Verify content type
- Review network connectivity
Next Steps
Explore related concepts and start building:

- Knowledge Base - Enhance chats with domain knowledge
- AI Agents - Build autonomous conversational agents
- LLM Models - Choose the right model for your use case
- Tools - Extend chat capabilities with custom functions