Proper configuration of your AI Agent is crucial for optimal performance. Each setting affects how your Agent communicates and processes information.

Core Settings

1. Select Language

Choose the primary language for your AI Agent’s interactions.
Select a language that matches your target audience. This affects both understanding and response capabilities.

2. Select Model

Choose the AI model that will power your Agent’s intelligence.

OpenAI Models

GPT-4.1

Highest intelligence
  • Ideal for complex tasks and advanced instruction following
  • Supports up to 1 million tokens for extensive context handling
  • Excels in reasoning and long-context comprehension
  • Suitable for use in Agentic AI Agents
  • Best choice for tool integration and sophisticated applications

GPT-4.1-mini

Balanced intelligence and performance
  • Faster responses with lower latency compared to GPT-4.1
  • Maintains support for up to 1 million tokens
  • Handles moderately complex instructions effectively
  • Suitable for applications requiring a balance between performance and cost
  • Can be used in Agentic AI Agents

GPT-4.1-nano

Lightweight and cost-effective
  • Fastest response times among the GPT-4.1 models
  • Supports up to 1 million tokens for context
  • Optimized for simple tasks like classification and autocomplete
  • Not suitable for complex instruction following or tool integration
  • Best for applications where speed and cost are prioritized over advanced capabilities

GPT-4o

Higher intelligence
  • Preferred when the system prompt of a Simple AI Agent contains complex instructions; this model is used by default in Agentic AI Agents
  • Slower responses, higher latency
  • Better language understanding
  • Higher accuracy; handles out-of-context queries better
  • Optimal for use with tools

GPT-4o Mini

Lower intelligence, budget-friendly
  • Faster responses, lower latency
  • Works best with simple system prompts and less complex instructions
  • Limited performance with tools
  • Best for simple queries
  • Not supported in Agentic AI Agents

Realtime Models

Realtime models are optimized for low-latency, real-time conversational experiences. These models enable natural, fluid interactions with minimal delay.

OpenAI Realtime Models

GPT-Realtime

Latest OpenAI realtime model
  • Ultra-low latency for real-time voice conversations
  • Optimized for natural, fluid dialogue
  • Supports streaming responses with minimal delay
  • Ideal for voice-based AI agents and live customer interactions
  • Handles context switches and interruptions gracefully

GPT-Realtime-2025-08-28

Dated version of OpenAI realtime model
  • Specific snapshot of the realtime model from August 28, 2025
  • Provides consistency for applications requiring a fixed model version
  • Same low-latency capabilities as GPT-Realtime
  • Use when you need version stability and predictable behavior

GPT-4o-Realtime-Preview

Preview version of GPT-4o realtime capabilities
  • Combines GPT-4o intelligence with realtime processing
  • Enhanced reasoning capabilities in real-time scenarios
  • Better handling of complex, multi-turn conversations
  • Preview status means features and performance may evolve

Google Gemini Realtime Models

Gemini-2.0-Flash-Live-001

Google’s fast realtime model
  • Optimized for speed and low latency
  • Excellent for live voice interactions
  • Fast response times suitable for conversational AI
  • Supports multimodal inputs in real-time scenarios

Gemini-Live-2.5-Flash-Preview

Enhanced preview version of Gemini Live
  • Latest preview of Google’s realtime capabilities
  • Improved performance and features over 2.0
  • Better context understanding in live conversations
  • Preview status indicates ongoing improvements
Realtime models are specifically designed for voice-based AI agents and scenarios requiring immediate responses. They prioritize low latency and natural conversation flow over complex reasoning tasks.
No STT Required: Realtime models have built-in speech recognition capabilities and do not require a separate Speech-to-Text (STT) module. They can directly process voice input and work seamlessly with the platform’s TTS (Text-to-Speech) for output.
Choose realtime models when building AI agents that need to handle voice calls or live chat interactions where response speed is critical. For complex tool-calling or reasoning tasks, consider using GPT-4.1 or GPT-4o standard models.
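Because realtime models skip the separate STT module, a voice session is configured in a single event. The sketch below is illustrative only, loosely following the shape of OpenAI's Realtime API `session.update` event; exact field names may differ by model and provider, so treat them as assumptions and check the provider's reference.

```python
import json

def build_session_update(model: str, voice: str, instructions: str) -> str:
    """Return a session.update event as a JSON string (illustrative shape)."""
    event = {
        "type": "session.update",
        "session": {
            "model": model,                # e.g. "gpt-realtime"
            "voice": voice,                # a built-in voice name
            "instructions": instructions,  # system-prompt equivalent
            # Realtime models transcribe audio themselves, so no
            # separate STT module is configured here.
            "modalities": ["audio", "text"],
        },
    }
    return json.dumps(event)

payload = build_session_update(
    model="gpt-realtime",
    voice="alloy",
    instructions="You are a concise voice support agent.",
)
```

Sending an event like this once at the start of the session sets the voice and instructions for every subsequent turn, which is why realtime agents need no per-turn STT/TTS wiring.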

Groq Models

Groq models are open-source models powered by Groq’s high-performance inference infrastructure, offering exceptional speed and cost-effectiveness.

Groq Llama 3.3 70B Versatile

High-performance open-source model
  • 70 billion parameters with optimized transformer architecture
  • 128K token context window for extensive context handling
  • Strong instruction following and tool use capabilities
  • Fast inference powered by Groq’s infrastructure

Groq Llama 3.1 8B Instant

Fast and budget-friendly open-source model
  • 8 billion parameters optimized for instant responses
  • 128K token context window
  • Ultra-fast inference for real-time applications
  • Ideal for applications requiring quick responses without complex reasoning
Groq models leverage Groq’s specialized LPU (Language Processing Unit) architecture to deliver exceptional inference speed, making them ideal for high-throughput applications and real-time use cases.
Choose Llama 3.3 70B Versatile for complex tasks requiring strong reasoning and tool use. Choose Llama 3.1 8B Instant when speed and cost are priorities and tasks are relatively straightforward.

Verbex Models

Verbex Bangla Mini

Bengali-specialized model
  • Optimized for Bengali language tasks and tool-calling scenarios
  • Handles Bengali text and tool interactions effectively
  • Suitable for building conversational agents that need to interact with external systems in Bengali
  • Faster than the GPT-4o and GPT-4.1 models
Model Card: Verbex Bangla Mini

Custom Model

Verbex supports OpenAI-compatible custom LLM models. You can use any LLM that exposes an OpenAI-compatible API. Before adding a custom model to Verbex, keep in mind that:
  • The model should be OpenAI compatible.
  • The model should support streaming responses.
  • The model should support tool calling.
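The three requirements above can be sanity-checked with a minimal request sketch. This is not Verbex platform code: the base URL, model id, and `book_calendar_slot` tool are hypothetical placeholders, but the body follows OpenAI's chat-completions format with streaming and function-calling tools enabled.

```python
import json

# Hypothetical endpoint for your own OpenAI-compatible deployment.
CUSTOM_BASE_URL = "https://your-llm-host.example.com/v1"

request_body = {
    "model": "your-custom-model",   # placeholder model id
    "stream": True,                 # requirement: streaming responses
    "messages": [
        {"role": "system", "content": "You are a booking assistant."},
        {"role": "user", "content": "Book a call for tomorrow at 3pm."},
    ],
    "tools": [                      # requirement: tool calling
        {
            "type": "function",
            "function": {
                "name": "book_calendar_slot",  # hypothetical tool
                "description": "Book a calendar slot for the caller.",
                "parameters": {
                    "type": "object",
                    "properties": {"datetime": {"type": "string"}},
                    "required": ["datetime"],
                },
            },
        }
    ],
}

# A compatible model must accept a POST of this body at
# {CUSTOM_BASE_URL}/chat/completions and stream back response chunks.
print(json.dumps(request_body, indent=2))
```

If your model rejects the `stream` flag or the `tools` array, it does not meet the requirements above and will not work reliably as a custom model.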
For AI Agents that use tools (like calendar booking, email sending, etc.), GPT-4o is strongly recommended. GPT-4o Mini has limited capabilities in handling tool interactions, which may result in unreliable tool execution and degraded performance.
We plan to add models like Anthropic Claude, Google Gemini, and more in the future.

3. Select STT

Select the STT module that will convert user speech into text.
The STT module is crucial for accurate transcription of customer speech, directly impacting your AI Agent’s ability to understand and respond appropriately.

4. Select Voice

Choose a voice that represents your brand and resonates with your audience.
When selecting a voice, consider:
  • Language compatibility
  • Gender preference
  • Accent appropriateness
  • Speaking style
  • Brand alignment

Configuration Best Practices

Language & Region

Match your target market’s primary language and regional preferences

Model Selection

Balance performance needs with budget constraints

Voice Choice

Align voice characteristics with brand identity

STT Accuracy

Test STT performance with your typical use cases

Performance Considerations

Model selection significantly impacts:
  • Response speed
  • Reasoning ability of the AI Agent
  • Handling of out-of-context queries
  • Overall user experience

Model Comparison

| Model | Performance | Cost | Best For | Tool Calling |
| --- | --- | --- | --- | --- |
| GPT-4.1 | Highest | Highest | Complex tasks, extensive context (1M tokens), agentic AI | Excellent |
| GPT-4.1-mini | High | Moderate | Balanced performance and cost, agentic AI | Excellent |
| GPT-4.1-nano | Moderate | Low | Simple tasks, classification, autocomplete | Limited |
| GPT-4o | High | Higher | Complex interactions, tool-based operations | Excellent |
| GPT-4o Mini | Moderate | Lower | Basic queries without tools | Limited |
| GPT-Realtime | High | Higher | Real-time voice conversations, live interactions | Excellent |
| GPT-4o-Realtime-Preview | High | Higher | Real-time with enhanced reasoning | Excellent |
| Groq Llama 3.3 70B | High | Low | Complex reasoning, coding, tool use (cost-effective) | Excellent |
| Groq Llama 3.1 8B | Moderate | Lowest | Fast responses, simple tasks (ultra cost-effective) | Good |
If your AI Agent will be using any tools or integrations, choose GPT-4o to ensure reliable tool calling performance. While GPT-4o Mini is cost-effective, it’s best suited for simple conversation-only scenarios.
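The comparison above can be condensed into a rule of thumb. The helper below is an illustrative sketch only, not part of the platform, and the returned model names are informal labels rather than exact API identifiers.

```python
def suggest_model(uses_tools: bool, realtime_voice: bool,
                  budget_sensitive: bool) -> str:
    """Rule of thumb derived from the comparison table (illustrative)."""
    if realtime_voice:
        # Realtime models prioritize latency over complex reasoning.
        return "gpt-realtime"
    if uses_tools:
        # Reliable tool calling needs a higher-capability model;
        # Groq Llama 3.3 70B is the cost-effective alternative.
        return "groq-llama-3.3-70b-versatile" if budget_sensitive else "gpt-4o"
    # Conversation-only agents can use cheaper models.
    return "gpt-4o-mini" if budget_sensitive else "gpt-4.1-mini"
```

For example, a budget-sensitive agent that books calendar slots would map to the Groq 70B model, while a conversation-only FAQ bot on a budget maps to GPT-4o Mini.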
Learn more about Tools Configuration →