MCP Playground Overview
Authentication
Requires a valid Bearer token with `mcp:write` permission.
Base URL
Chat with MCP
Send Message to MCP Playground
Send a message and receive streaming responses from AI using MCP tools.

POST /api/mcp_playground/chat
Request Body:

| Field | Type | Required | Description |
|---|---|---|---|
| content | string | Yes | User's message/query |
| mcp_instance_ids | array | No | MCP instance UUIDs to use (Direct Mode) |
| model | string | No | LLM model to use (defaults based on mode) |
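As a sketch, the request body can be assembled client-side like this. Field names and per-mode defaults are taken from the tables in this section; the `build_chat_payload` helper is illustrative, not part of the API (the server applies these defaults itself, so sending `model` is optional):

```python
from typing import Optional

# Defaults described in this doc: deepseek-chat for Intelligent Mode,
# gemini-2.5-flash for Direct Mode.
INTELLIGENT_DEFAULT = "deepseek-chat"
DIRECT_DEFAULT = "gemini-2.5-flash"

def build_chat_payload(content: str,
                       mcp_instance_ids: Optional[list] = None,
                       model: Optional[str] = None) -> dict:
    """Build the POST /api/mcp_playground/chat request body."""
    payload = {"content": content}
    if mcp_instance_ids:
        # Direct MCP Mode: connect to the given instances.
        payload["mcp_instance_ids"] = mcp_instance_ids
        payload["model"] = model or DIRECT_DEFAULT
    else:
        # Intelligent Discovery Mode: no instance IDs provided.
        payload["model"] = model or INTELLIGENT_DEFAULT
    return payload
```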
| Model Pattern | Provider | Default Behavior | Use Case |
|---|---|---|---|
| gpt-*, o1* | OpenAI | - | General purpose, reasoning |
| claude-* | Anthropic | - | Tool use, long context |
| deepseek-* | DeepSeek | Intelligent Mode: deepseek-chat | Database queries, cost-effective |
| gemini-* | Google | Direct Mode: gemini-2.5-flash | Fast tool execution |
- Intelligent Discovery Mode (no `mcp_instance_ids`): Defaults to `deepseek-chat` if `model` is not provided
- Direct MCP Mode (with `mcp_instance_ids`): Defaults to `gemini-2.5-flash` if `model` is not provided
Stream Event Types
1. Message Content
Regular AI response tokens.

2. Tool Call Started

AI initiates a tool call.

3. Tool Call Completed

Tool execution finished.

| Field | Type | Description |
|---|---|---|
| successful | boolean | Whether the tool executed successfully |
| error | string | Error message if failed (null if successful) |
| output | object | Tool-specific result data |
Metadata fields preserved inside tool output (e.g., Gmail-style pagination):

| Field | Type | Description |
|---|---|---|
| nextPageToken | string | Token for the next page of results |
| resultSizeEstimate | number | Total number of results available |
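For illustration, a completed tool-call event might look like the following. The top-level field names come from the tables above; the concrete output values and the `is_success` helper are hypothetical:

```python
# Illustrative shape only: successful/error/output are documented fields;
# the Gmail-style metadata is one example of tool-specific output.
completed = {
    "successful": True,
    "error": None,
    "output": {
        "messages": ["..."],
        "nextPageToken": "token_abc",  # hypothetical value
        "resultSizeEstimate": 42,      # hypothetical value
    },
}

def is_success(event: dict) -> bool:
    """True when the tool ran successfully (error is null/None)."""
    return bool(event.get("successful")) and event.get("error") is None
```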
4. MCP Server Recommendation
Intelligent Mode suggests servers to connect. Handling steps:

- Extract the `server_ids`
- Fetch server details via the MCP Service
- Prompt the user to connect the servers
- After connection, resend the message with `mcp_instance_ids`
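The handling steps above can be sketched as follows. The `{"mcp": {"server_ids": [...]}}` shape comes from the Intelligent Discovery Mode description below; `connect_servers` stands in for your app's fetch-details-and-prompt-the-user flow:

```python
def extract_server_ids(event: dict) -> list:
    """Pull recommended server IDs out of a stream event, if present."""
    return event.get("mcp", {}).get("server_ids", [])

def handle_recommendation(event: dict, connect_servers) -> dict:
    """Connect recommended servers, then build the follow-up request body.

    connect_servers is app-specific: it should fetch server details, prompt
    the user, create the connections, and return the new instance IDs.
    """
    server_ids = extract_server_ids(event)
    if not server_ids:
        return {}
    instance_ids = connect_servers(server_ids)  # user-driven step
    # Resend the original message in Direct Mode with the new instances.
    return {"content": "<original user message>",
            "mcp_instance_ids": instance_ids}
```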
5. Completion Signal

Stream finished.

6. Error

Stream encountered an error.

Mode Details
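A minimal client-side dispatcher over the six event types might look like this. The `type` discriminator values here are placeholders, since the exact wire format is not specified on this page:

```python
def dispatch_event(event: dict, state: dict) -> None:
    """Route one parsed stream event into accumulated client state.

    The 'type' strings below are assumed names; substitute whatever
    discriminator your stream actually emits.
    """
    kind = event.get("type")
    if kind == "message":                 # 1. Message Content
        state["text"] += event.get("content", "")
    elif kind == "tool_call_started":     # 2. Tool Call Started
        state["tools"].append({"name": event.get("name"), "done": False})
    elif kind == "tool_call_completed":   # 3. Tool Call Completed
        state["tools"][-1].update(done=True,
                                  successful=event.get("successful"),
                                  output=event.get("output"))
    elif kind == "mcp_recommendation":    # 4. MCP Server Recommendation
        state["server_ids"] = event.get("mcp", {}).get("server_ids", [])
    elif kind == "done":                  # 5. Completion Signal
        state["finished"] = True
    elif kind == "error":                 # 6. Error
        state["error"] = event.get("message")
```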
Intelligent Discovery Mode
Trigger: No `mcp_instance_ids` provided
Behavior:
- AI analyzes user query
- Searches database for relevant MCP servers
- Returns streaming response with recommendations
- Includes `{"mcp": {"server_ids": [...]}}` in the stream
- Natural Language Understanding: Analyzes intent
- Database Search: Queries mcp_servers and mcp_tools tables
- Context Awareness: Uses conversation history (30 exchanges)
- Smart Recommendations: Suggests relevant servers only
- Conversational: Continues chatting while recommending
Direct MCP Mode
Trigger: `mcp_instance_ids` provided
Behavior:
- Generates MCP URLs for specified instances
- Connects to all MCP servers
- Initializes AI agent with multi-MCP tools
- Streams tool calls and responses
- Maintains conversation history (16 exchanges)
- Parallel Connections: Connect to multiple servers
- Tool Orchestration: AI chains tools across servers
- Context Preservation: Full conversation history
- Metadata Tracking: Pagination tokens preserved
- XML Prompting: Structured context for better responses
Conversation Management
Memory System
Conversation Storage:
- Intelligent Mode: 60 messages max (30 exchanges), hardcoded
- Direct Mode: 30 messages, passed as the `memory_size` parameter to the playground factory
- TTL: 30 minutes of inactivity (hardcoded)
- Format: Chronological array of `{role, content, created_at}`

Memory enables contextual follow-ups:
- "Show me more": the AI remembers the previous query
- "Send that to Slack": the AI knows what "that" refers to
- "What was the first email about?": the AI recalls earlier context
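The storage limits above can be sketched with a capped, TTL-aware buffer. Class and method names are illustrative, not the service's internals:

```python
import time
from collections import deque
from typing import Optional

class ConversationMemory:
    """Sketch of the documented limits: a message cap plus a
    30-minute inactivity TTL."""

    def __init__(self, max_messages: int = 30, ttl_seconds: int = 30 * 60):
        # deque with maxlen silently drops the oldest message when full.
        self.messages = deque(maxlen=max_messages)
        self.ttl = ttl_seconds
        self.last_active = time.time()

    def add(self, role: str, content: str, now: Optional[float] = None) -> None:
        now = now if now is not None else time.time()
        if now - self.last_active > self.ttl:
            self.messages.clear()  # conversation expired: start fresh
        self.last_active = now
        self.messages.append({"role": role, "content": content, "created_at": now})
```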
Tool Metadata Tracking
Pagination tokens and metadata are automatically preserved across conversation turns.

Error Handling
Connection Errors
Common causes:
- MCP server unavailable
- Invalid MCP URL
- Network issues

Resolutions:
- Verify the MCP instance is active
- Check network connectivity
- Try reconnecting the MCP instance
Timeout Errors
Common causes:
- MCP server too slow (>30s)
- Heavy computation
- Network latency

Resolutions:
- Retry the request
- Use a faster MCP server
- Break the task into smaller operations
Tool Errors
Common causes:
- Insufficient permissions
- Invalid parameters
- Resource not found
- Rate limit exceeded
No Valid Instances
Common causes:
- Instance IDs don't exist
- Instances are not active
- User doesn't own the instances

Resolutions:
- Check the instance IDs
- Verify instances are active via `/api/mcp/list_instances`
- Reconnect MCP servers
Advanced Features
Multi-Server Orchestration
Chain operations across multiple MCP servers, e.g., "Find this week's invoices in Gmail and upload them to Drive."

Contextual Follow-ups

Leverage conversation history for natural interactions, e.g., "Now send that summary to Slack."

Complex Searches

Use comprehensive search strategies, e.g., combine sender, date range, and keyword filters in a single query.

Best Practices
For Frontend Developers
- Handle Both Modes: detect whether a stream carries server recommendations (Intelligent Mode) or tool calls (Direct Mode), and render each accordingly
- Display Tool Execution: surface tool-call started/completed events so users can see what the AI is doing
- Buffer Message Tokens: accumulate streamed content tokens before rendering to avoid flicker
- Handle Errors Gracefully: show actionable messages for connection, timeout, and permission errors
For Users
- Be Specific: "Show urgent emails from last week" vs "show emails"
- Use Natural Language: The AI understands intent
- Leverage History: Reference previous messages
- Connect Relevant Servers: Only connect what youโll use
- Review Tool Calls: Understand what the AI is doing
For Developers
- Model Selection:
  - Intelligent Mode Default: `deepseek-chat` (hardcoded, optimized for DB queries)
  - Direct Mode Default: `gemini-2.5-flash` (hardcoded, optimized for tool execution)
  - Override for Quality: Use `gpt-4`, `o1`, or `claude-3.5-sonnet` for complex reasoning
  - Override for Context: Use `claude-3-opus` for long conversations
  - Model string must start with: `gpt`, `o1`, `claude`, `deepseek`, or `gemini`
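The prefix rule in Model Selection can be sketched as a simple lookup. `resolve_provider` is illustrative; the Google provider for `gemini-*` follows from the model table earlier on this page:

```python
# Prefix-to-provider mapping per the model table and the
# "model string must start with" rule above.
PROVIDER_PREFIXES = {
    "gpt": "OpenAI",
    "o1": "OpenAI",
    "claude": "Anthropic",
    "deepseek": "DeepSeek",
    "gemini": "Google",
}

def resolve_provider(model: str) -> str:
    """Map a model string to its provider by prefix."""
    for prefix, provider in PROVIDER_PREFIXES.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"Unsupported model: {model!r}")
```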
- Instance Management:
  - Cache active instance IDs
  - Refresh the list periodically
  - Handle instance expiration
- Error Recovery:
  - Retry on timeout (once)
  - Fall back to simpler queries
  - Give clear instructions on permission errors
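The retry-once guidance can be sketched as follows; the `send` callable stands in for whatever transport function you use:

```python
def send_with_retry(send, request: dict, timeout_error=TimeoutError):
    """Call send(request), retrying exactly once on timeout.

    A second timeout propagates to the caller, matching the
    "retry on timeout (once)" guidance above.
    """
    try:
        return send(request)
    except timeout_error:
        return send(request)  # single retry
```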
Rate Limiting
- Feature flag: `mcp_sessions` quota
- Concurrent MCP connections: no hard limit (performance degrades)
- Message rate: no explicit limit
- Tool execution timeout: 30 seconds per tool
Security
- Authentication: JWT required
- RBAC: `mcp:write` permission required
- Org Isolation: Sessions scoped to organizations
- Data Privacy: Conversations cleared after 30 minutes
- Tool Safety: User owns all MCP connections
Performance Considerations
Optimize for Speed
- Use Fast Models:
  - Default `gemini-2.5-flash` for Direct Mode (optimized for tool use)
  - Default `deepseek-chat` for Intelligent Mode (optimized for DB queries)
  - Override with faster models if needed
- Limit MCP Servers:
  - Only connect the servers you need
  - More servers = slower initialization
- Reduce History:
  - Default 30 messages for Direct Mode
  - Reduce the `memory_size` parameter for faster context loading
- Batch Operations:
  - "Process the first 10 emails" vs "Process all"
Optimize for Quality
- Use Reasoning Models:
  - Override defaults with `gpt-4` or `o1` for complex logic
  - Use `claude-3.5-sonnet` for advanced tool orchestration
  - Defaults are optimized for speed, not quality
- Provide Context:
  - Longer queries = better understanding
  - Reference previous conversations
  - Use the conversation history features
- Note on History:
  - Intelligent Mode: Fixed at 30 exchanges
  - Direct Mode: Fixed at 30 messages (`memory_size` hardcoded in the service)
  - Cannot be increased without code changes
Next Steps
Continue exploring MCP integration:

- MCP Service API: Manage MCP connections
- MCP Concepts: Deep dive into architecture
- Chat Service: Standard chat without MCP
- Authentication: Security and permissions