The Chat System is the core conversational interface in Definable.ai that enables users to interact with AI models and agents through natural language conversations. It provides a complete messaging infrastructure with support for multi-turn dialogues, file attachments, knowledge base integration, and real-time streaming responses.

What is a Chat?

A Chat (also called Chat Session) is a conversational thread that:
  • Maintains Context: Keeps track of the entire conversation history
  • Supports Multi-turn Dialogue: Allows back-and-forth exchanges between users and AI
  • Integrates Knowledge: Can leverage knowledge bases for enhanced responses
  • Handles Media: Supports file uploads including documents, images, audio, and video
  • Streams Responses: Provides real-time token-by-token streaming for immediate feedback
  • Tracks Usage: Monitors token consumption and billing information

Chat Architecture

Core Components

1. Chat Sessions

A chat session is the container for an entire conversation. Key Features:
  • Unique Identifier: Each session has a UUID for tracking
  • Title Management: Auto-generates meaningful titles from conversation content
  • Status Tracking: Active, Archived, or Deleted states
  • Metadata Storage: Flexible JSON storage for custom data
  • Settings Persistence: Saves user preferences for temperature, max tokens, etc.
Session Lifecycle:
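The lifecycle can be sketched as a small state machine over the three statuses named above. This is a hypothetical sketch: the exact transition set (in particular, whether an ARCHIVED chat can be re-activated) is an assumption, not a documented guarantee.

```python
from enum import Enum

class ChatStatus(Enum):
    ACTIVE = "active"
    ARCHIVED = "archived"
    DELETED = "deleted"

# Assumed transitions: a session starts ACTIVE, may be archived or
# soft-deleted, and an archived session may be re-activated or deleted.
TRANSITIONS = {
    ChatStatus.ACTIVE: {ChatStatus.ARCHIVED, ChatStatus.DELETED},
    ChatStatus.ARCHIVED: {ChatStatus.ACTIVE, ChatStatus.DELETED},
    ChatStatus.DELETED: set(),  # terminal: soft-deleted sessions stay deleted
}

def transition(current: ChatStatus, target: ChatStatus) -> ChatStatus:
    """Move a session to a new status, rejecting invalid transitions."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move {current.value} -> {target.value}")
    return target
```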

2. Messages

Messages are the individual exchanges within a chat session. Message Types:
  • USER: Messages sent by the human user
  • MODEL: Responses generated by LLM models
  • AGENT: Responses from AI agents
Message Structure:
  • Content (text/media)
  • Role (USER/MODEL/AGENT)
  • Parent message ID (for threading)
  • Model or Agent ID
  • Prompt/Instruction ID
  • Metadata (knowledge base IDs, file references)
  • Timestamps
Message Threading:
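Threading follows the parent message ID listed in the structure above: each message points at the message it replies to, so a branch can be reconstructed by walking those links back to the root. A minimal sketch (`thread_of` and the dict shape are illustrative, not the actual API):

```python
def thread_of(messages: list[dict], leaf_id: str) -> list[dict]:
    """Walk parent_message_id links from a leaf message back to the
    root, returning the branch in chronological (root-first) order."""
    by_id = {m["id"]: m for m in messages}
    branch = []
    current = by_id.get(leaf_id)
    while current is not None:
        branch.append(current)
        current = by_id.get(current.get("parent_message_id"))
    return list(reversed(branch))
```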

3. File Uploads

The chat system supports rich media attachments. Supported File Types:
  • Documents: PDF, TXT, DOCX, XLSX, CSV
  • Images: JPG, PNG, GIF, WebP
  • Audio: MP3, WAV, M4A, OGG
  • Video: MP4, WebM, MOV
Upload Process:

File Processing:
  • Files are uploaded to GCP Storage
  • Presigned URLs generated for secure access
  • Content extraction for text-based files
  • Image/video processing for multimodal models

4. Chat Settings

Per-session LLM parameters that override defaults. Available Settings:
  • temperature: Controls randomness (0.0 - 1.0)
  • max_tokens: Maximum response length
  • top_p: Nucleus sampling threshold
Settings Hierarchy:
  1. Request-level parameters (highest priority)
  2. Saved session settings
  3. Model defaults (lowest priority)
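The three-level hierarchy resolves to an ordinary override merge: start from model defaults, layer saved session settings on top, then request-level parameters. A sketch with hypothetical dict shapes (`None` treated as "not set"):

```python
def resolve_settings(request: dict, session: dict, model_defaults: dict) -> dict:
    """Merge LLM parameters with priority: request > session > defaults.
    Keys set to None are treated as absent and do not override."""
    resolved = dict(model_defaults)
    resolved.update({k: v for k, v in session.items() if v is not None})
    resolved.update({k: v for k, v in request.items() if v is not None})
    return resolved
```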

Chat Features

Real-time Streaming

Responses stream token-by-token for immediate user feedback.

Streaming Flow:

Stream Format:
data: {"message": "Hello"}
data: {"message": " there"}
data: {"message": "!"}
data: {"message": "DONE"}
Special Stream Types:
  • Text Tokens: Regular chat responses
  • Reasoning Steps: For models with reasoning capabilities
  • Image Generation: Progress updates and final URLs
  • Video Generation: Progress updates and final URLs
  • Error Messages: Graceful error handling
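A client can consume the stream format shown above with a small parser over the `data:` lines. This is a sketch: it assumes the literal `"DONE"` sentinel from the example marks end-of-stream, and ignores the special stream types.

```python
import json

def read_stream(lines):
    """Yield text tokens from server-sent-event lines of the form
    'data: {"message": ...}', stopping at the DONE sentinel."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = json.loads(line[len("data: "):])
        token = payload.get("message", "")
        if token == "DONE":
            return
        yield token
```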

Knowledge Base Integration

Chats can leverage knowledge bases for enhanced responses.

Integration Flow:

How It Works:
  1. User specifies knowledge base IDs in the message
  2. System performs semantic search on user query
  3. Top relevant chunks retrieved (configurable limit)
  4. Context injected into the prompt
  5. LLM generates response using both its knowledge and KB context
Enhanced Prompt Format:
[System Prompt]

KNOWLEDGE BASE CONTEXT:
[Knowledge Base Context]: Retrieved information from KB 1
[Knowledge Base Context]: Retrieved information from KB 2

Use the above context to answer the user's question when relevant.

USER QUESTION: [User's actual question]
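Assembling that layout from retrieved chunks is straightforward string composition. A sketch that follows the template above literally; the function name and chunk representation are illustrative:

```python
def build_prompt(system_prompt: str, kb_chunks: list[str], question: str) -> str:
    """Assemble the enhanced prompt in the layout shown above:
    system prompt, labeled KB context lines, then the user question."""
    context = "\n".join(f"[Knowledge Base Context]: {chunk}" for chunk in kb_chunks)
    return (
        f"{system_prompt}\n\n"
        "KNOWLEDGE BASE CONTEXT:\n"
        f"{context}\n\n"
        "Use the above context to answer the user's question when relevant.\n\n"
        f"USER QUESTION: {question}"
    )
```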

Audio Transcription

Convert speech to text for voice-based interactions. Transcription Features:
  • Multiple Formats: MP3, WAV, M4A, OGG
  • Language Support: Multi-language transcription
  • Streaming Support: Process audio files of any size
  • High Accuracy: Powered by advanced speech recognition
Usage Flow:

Prompt Generation

AI-powered prompt enhancement and generation. Prompt Types:
  • Creative: Generate creative writing prompts
  • Task: Create task-oriented prompts
  • Question: Generate insightful questions
  • Continuation: Suggest conversation continuations
Streaming Prompt Generation:

Multi-Modal Support

Advanced models can handle various content types. Capabilities:
  • Text Generation: Standard chat responses
  • Image Generation: Create images from text descriptions
  • Video Generation: Generate short video clips
  • Vision: Analyze uploaded images
  • Document Understanding: Extract and analyze document content
Model Selection:
# Text-only chat
model_id = "gpt-4-turbo"

# Image generation
model_id = "dall-e-3"  # Supports image generation

# Video generation
model_id = "runway-gen3"  # Supports video generation

Billing and Usage Tracking

Credit System

Every chat interaction is tracked and billed.

Billing Flow:

Token Calculation:
# Calculate weighted tokens based on model pricing.
# Token counts should come from the provider's usage report;
# splitting the message on whitespace would count words, not tokens.
input_tokens = usage.input_tokens
output_tokens = usage.output_tokens

# Get model-specific pricing ratios (relative token weights)
pricing = {
    "input": 1.0,   # weight per input token
    "output": 2.0   # weight per output token
}

# Calculate the weighted token total used for billing
weighted_total = (input_tokens * pricing["input"]) + (output_tokens * pricing["output"])
Metadata Stored:
  • Input tokens
  • Output tokens
  • Total tokens (weighted)
  • Cached tokens (if applicable)
  • Pricing ratios
  • Message IDs

Usage Metadata

Chat sessions track cumulative usage. Tracked Metrics:
{
  "usage": {
    "input_tokens": 1500,
    "output_tokens": 3000,
    "total_tokens": 4500,
    "cached_tokens": 500
  }
}
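Each completed turn adds its token counts to the session totals above. A minimal sketch of that accumulation, using the same keys as the example (the function name is illustrative):

```python
def accumulate_usage(session_usage: dict, turn: dict) -> dict:
    """Add one turn's token counts to the session's cumulative usage,
    treating missing keys as zero."""
    keys = ("input_tokens", "output_tokens", "total_tokens", "cached_tokens")
    return {k: session_usage.get(k, 0) + turn.get(k, 0) for k in keys}
```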

Chat Operations

Creating a Chat Session

Auto-Created Sessions:
  • If no chat_id provided in send_message
  • System creates session automatically
  • Title set to "New Chat"
  • Status set to ACTIVE

Updating Chat Sessions

Updatable Fields:
  • Title (manual or auto-generated)
  • Status (ACTIVE, ARCHIVED, DELETED)
  • Settings (temperature, max_tokens, top_p)
Auto-Title Generation:

Deleting Chat Sessions

Single Delete:
  • Soft delete (status change to DELETED)
  • Cascade deletes messages
  • Removes file upload links
Bulk Delete:
  • Delete multiple sessions in one request
  • Validates ownership
  • Returns count of deleted sessions
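The bulk delete semantics above can be sketched as follows. One assumption: sessions owned by another user are silently skipped rather than causing the whole request to fail; the actual API may instead reject the request.

```python
def bulk_delete(sessions: dict, chat_ids: list[str], user_id: str) -> int:
    """Soft-delete the caller's sessions and return how many were
    deleted; unknown or non-owned IDs are skipped (assumed behavior)."""
    deleted = 0
    for chat_id in chat_ids:
        session = sessions.get(chat_id)
        if session is None or session["user_id"] != user_id:
            continue  # ownership validation
        if session["status"] != "DELETED":
            session["status"] = "DELETED"  # soft delete, not a row removal
            deleted += 1
    return deleted
```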

Advanced Features

Agent Integration

Chats can interact with deployed AI agents.

Agent Chat Flow:

Agent Requirements:
  • Agent must be active
  • Agent must be deployed
  • User must have access to agent

Instruction/Prompt System

Use pre-defined prompts to guide AI behavior.

Prompt Integration:

Prompt Components:
  • ID: Unique identifier
  • Title: Display name
  • Description: Purpose description
  • Content: Actual prompt text

WebSocket Updates

Real-time updates for chat state changes. Events Broadcasted:
  • Chat title updates (after auto-generation)
  • New message notifications
  • Status changes
WebSocket Message Format:
{
  "data": {
    "id": "chat-uuid",
    "title": "Updated Title",
    "user_id": "user-uuid",
    "org_id": "org-uuid"
  }
}

Firebase Integration

Chat updates sync to Firebase Realtime Database. Firebase Path:
{org_id}/chats_write/data
Data Structure:
{
  "id": "chat-uuid",
  "title": "Chat Title"
}

Best Practices

Chat Management

  1. Session Organization: Use meaningful titles and archive old chats
  2. Settings Optimization: Adjust temperature and max_tokens per use case
  3. Knowledge Base Selection: Only include relevant KBs to reduce noise
  4. File Management: Clean up unused uploads regularly

Performance Optimization

  1. Streaming: Always use streaming for better UX
  2. Context Window: Monitor message count and summarize long conversations
  3. Token Limits: Set appropriate max_tokens to control costs
  4. Caching: Leverage cached tokens when available

Error Handling

  1. Billing Failures: Handle insufficient credits gracefully
  2. LLM Errors: Display user-friendly error messages
  3. File Upload Limits: Validate file size and type before upload
  4. Timeout Handling: Set appropriate timeouts for long operations

Security Considerations

  1. Access Control: Use RBAC to control chat access
  2. File Validation: Validate file types and scan for malware
  3. Content Filtering: Implement content moderation as needed
  4. Data Privacy: Handle sensitive conversations appropriately

Common Use Cases

Customer Support Bot

Configuration:
  • Temperature: 0.3 (more deterministic)
  • Knowledge Bases: FAQ, Product Docs, Support Articles
  • Prompts: Customer service instructions

Code Assistant

Features:
  • File upload for code review
  • Multi-turn debugging conversations
  • Code generation with context
  • Documentation lookup via KB
Configuration:
  • Temperature: 0.7 (balanced creativity)
  • Models: GPT-4, Claude-3.5-Sonnet
  • File types: .py, .js, .java, etc.

Research Assistant

Features:
  • Document upload and analysis
  • Knowledge base integration
  • Citation tracking
  • Long-form responses
Configuration:
  • Temperature: 0.5 (factual)
  • Max tokens: 4096 (long responses)
  • Knowledge Bases: Research papers, articles

Creative Writing

Features:
  • Prompt generation
  • Story continuation
  • Character development
  • Style adaptation
Configuration:
  • Temperature: 0.9 (highly creative)
  • Prompt type: "creative"
  • Models: GPT-4, Claude-3-Opus

Performance Metrics

Quality Metrics

  • Response Relevance: Accuracy of AI responses
  • Context Retention: How well context is maintained
  • Knowledge Integration: Effectiveness of KB usage
  • Error Rate: Frequency of failed interactions

Efficiency Metrics

  • Response Time: Time to first token
  • Streaming Speed: Tokens per second
  • Token Usage: Input/output token counts
  • Cost per Chat: Average credits consumed

Usage Metrics

  • Active Sessions: Number of ongoing chats
  • Message Volume: Total messages sent
  • File Uploads: Number and size of uploads
  • Knowledge Base Hits: KB search frequency

Troubleshooting

Common Issues

Slow Responses:
  • Check LLM provider status
  • Reduce max_tokens
  • Optimize knowledge base searches
  • Monitor network latency
Billing Errors:
  • Verify sufficient credits
  • Check transaction logs
  • Review usage metadata
  • Contact support for discrepancies
Context Loss:
  • Keep conversations under context limit
  • Implement conversation summarization
  • Use parent_message_id correctly
  • Check message ordering
File Upload Failures:
  • Validate file size limits
  • Check GCP storage configuration
  • Verify content type
  • Review network connectivity

Next Steps

Explore related concepts and start building:
  • Knowledge Base - Enhance chats with domain knowledge
  • AI Agents - Build autonomous conversational agents
  • LLM Models - Choose the right model for your use case
  • Tools - Extend chat capabilities with custom functions
Ready to start chatting? Check out the Chat API Reference or our Getting Started Guide.