Configure Your AI Agent

Proper configuration of your AI Agent is crucial for optimal performance. Each setting affects how your Agent communicates and processes information.

Core Settings

1. Select Language

Choose the primary language for your AI Agent’s interactions.

Select a language that matches your target audience. This affects both understanding and response capabilities.

2. Select Model

Choose the AI model that will power your Agent’s intelligence.

OpenAI Models

GPT-4.1

Highest intelligence

Ideal for complex tasks and advanced instruction following
Supports up to 1 million tokens for extensive context handling
Excels in reasoning, and long-context comprehension
Suitable for use in Agentic AI Agents
Best choice for tool integration and sophisticated applications

GPT-4.1-mini

Balanced intelligence and performance

Faster responses with lower latency compared to GPT-4.1
Maintains support for up to 1 million tokens
Handles moderately complex instructions effectively
Suitable for applications requiring a balance between performance and cost
Can be used in Agentic AI Agents

GPT-4.1-nano

Lightweight and cost-effective

Fastest response times among the GPT-4.1 models
Supports up to 1 million tokens for context
Optimized for simple tasks like classification and autocomplete
Not suitable for complex instruction following or tool integration
Best for applications where speed and cost are prioritized over advanced capabilities

GPT-4o

Higher intelligence

First for use with complex instructions in the system prompt of a Simple AI Agent. In an Agentic AI Agent, this model is used by default
Slower responses, higher latency
Better understanding
Higher accuracy, handles out of context queries better
Optimal for using with tools

GPT-4o Mini

Lower intelligence, budget-friendly

Faster responses, lower latency
Simple system prompt with less complex instructions
Limited performance with tools
Best for simple queries
Not supported in Agentic AI Agents

Realtime Models

Realtime models are optimized for low-latency, real-time conversational experiences. These models enable natural, fluid interactions with minimal delay.

OpenAI Realtime Models

GPT-Realtime

Latest OpenAI realtime model

Ultra-low latency for real-time voice conversations
Optimized for natural, fluid dialogue
Supports streaming responses with minimal delay
Ideal for voice-based AI agents and live customer interactions
Handles context switches and interruptions gracefully

GPT-Realtime-2025-08-28

Dated version of OpenAI realtime model

Specific snapshot of the realtime model from August 28, 2025
Provides consistency for applications requiring a fixed model version
Same low-latency capabilities as GPT-Realtime
Use when you need version stability and predictable behavior

GPT-4o-Realtime-Preview

Preview version of GPT-4o realtime capabilities

Combines GPT-4o intelligence with realtime processing
Enhanced reasoning capabilities in real-time scenarios
Better handling of complex, multi-turn conversations
Preview status means features and performance may evolve

Google Gemini Realtime Models

Gemini-2.0-Flash-Live-001

Google’s fast realtime model

Optimized for speed and low latency
Excellent for live voice interactions
Fast response times suitable for conversational AI
Supports multimodal inputs in real-time scenarios

Gemini-Live-2.5-Flash-Preview

Enhanced preview version of Gemini Live

Latest preview of Google’s realtime capabilities
Improved performance and features over 2.0
Better context understanding in live conversations
Preview status indicates ongoing improvements

Realtime models are specifically designed for voice-based AI agents and scenarios requiring immediate responses. They prioritize low latency and natural conversation flow over complex reasoning tasks.

No STT Required: Realtime models have built-in speech recognition capabilities and do not require a separate Speech-to-Text (STT) module. They can directly process voice input and work seamlessly with the platform’s TTS (Text-to-Speech) for output.

Choose realtime models when building AI agents that need to handle voice calls or live chat interactions where response speed is critical. For complex tool-calling or reasoning tasks, consider using GPT-4.1 or GPT-4o standard models.

Groq Models

Groq models are open-source models powered by Groq’s high-performance inference infrastructure, offering exceptional speed and cost-effectiveness.

Groq Llama 3.3 70B Versatile

High-performance open-source model

70 billion parameters with optimized transformer architecture
128K token context window for extensive context handling
Strong instruction following and tool use capabilities
Fast inference powered by Groq’s infrastructure

Groq Llama 3.1 8B Instant

Fast and budget-friendly open-source model

8 billion parameters optimized for instant responses
128K token context window
Ultra-fast inference for real-time applications
Ideal for applications requiring quick responses without complex reasoning

Groq models leverage Groq’s specialized LPU (Language Processing Unit) architecture to deliver exceptional inference speed, making them ideal for high-throughput applications and real-time use cases.

Choose Llama 3.3 70B Versatile for complex tasks requiring strong reasoning and tool use. Choose Llama 3.1 8B Instant when speed and cost are priorities and tasks are relatively straightforward.

Verbex Models

Verbex Bangla Mini

Bengali-specialized model

Optimized for Bengali language tasks and tool-calling scenarios
Handles Bengali text and tool interactions effectively
Suitable for building conversational agents that need to interact with external systems in Bengali
Faster than GPT-4o, GPT-4.1 model
Best for Bengali language tasks and tool-calling scenarios

Model Card: Verbex Bangla Mini

Custom Model

Verbex support OpenAI compatible custom LLM models. You can use any LLM that is supported by OpenAI compatible models. Before adding a custom model to Verbex, keep in mind that:

The model should be OpenAI compatible.
The model should support streaming responses.
The model should support tool calling.

For AI Agents that use tools (like calendar booking, email sending, etc.), GPT-4o is strongly recommended. GPT-4o Mini has limited capabilities in handling tool interactions, which may result in unreliable tool execution and degraded performance.

We plan to add models like Anthropic Claude, Google Gemini, and more in the future.

3. Select STT

Select the STT module that will convert user speech into text.

The STT module is crucial for accurate transcription of customer speech, directly impacting your AI Agent’s ability to understand and respond appropriately.

4. Select Voice

Choose a voice that represents your brand and resonates with your audience.

When selecting a voice, consider:

Language compatibility
Gender preference
Accent appropriateness
Speaking style
Brand alignment

Configuration Best Practices

Language & Region

Match your target market’s primary language and regional preferences

Model Selection

Balance performance needs with budget constraints

Voice Choice

Align voice characteristics with brand identity

STT Accuracy

Test STT performance with your typical use cases

Performance Considerations

Model selection significantly impacts:

Response speed
Reasoning ability of the AI Agent
Handling out of context queries
Overall user experience

Model Comparison

Model	Performance	Cost	Best For	Tool Calling
GPT-4.1	Highest	Highest	Complex tasks, extensive context (1M tokens), agentic AI	Excellent
GPT-4.1-mini	High	Moderate	Balanced performance and cost, agentic AI	Excellent
GPT-4.1-nano	Moderate	Low	Simple tasks, classification, autocomplete	Limited
GPT-4o	High	Higher	Complex interactions, tool-based operations	Excellent
GPT-4o Mini	Moderate	Lower	Basic queries without tools	Limited
GPT-Realtime	High	Higher	Real-time voice conversations, live interactions	Excellent
GPT-4o-Realtime-Preview	High	Higher	Real-time with enhanced reasoning	Excellent
Groq Llama 3.3 70B	High	Low	Complex reasoning, coding, tool use (cost-effective)	Excellent
Groq Llama 3.1 8B	Moderate	Lowest	Fast responses, simple tasks (ultra cost-effective)	Good

If your AI Agent will be using any tools or integrations, choose GPT-4o to ensure reliable tool calling performance. While GPT-4o Mini is cost-effective, it’s best suited for simple conversation-only scenarios.

Learn more about Tools Configuration →

Getting Started

Build Your AI Agent - Step by Step

Navigating within Verbex

Batch Call

Monitor Your AI Agent

Register & Handle Webhooks

Resources

Configure Your AI Agent

Core Settings

1. Select Language

2. Select Model

OpenAI Models

GPT-4.1

GPT-4.1-mini

GPT-4.1-nano

GPT-4o

GPT-4o Mini

Realtime Models

OpenAI Realtime Models

GPT-Realtime

GPT-Realtime-2025-08-28

GPT-4o-Realtime-Preview

Google Gemini Realtime Models

Gemini-2.0-Flash-Live-001

Gemini-Live-2.5-Flash-Preview

Groq Models

Groq Llama 3.3 70B Versatile

Groq Llama 3.1 8B Instant

Verbex Models

Verbex Bangla Mini

Custom Model

3. Select STT

4. Select Voice

Configuration Best Practices

Language & Region

Model Selection

Voice Choice

STT Accuracy

Performance Considerations

Model Comparison

Getting Started

Build Your AI Agent - Step by Step

Navigating within Verbex

Batch Call

Monitor Your AI Agent

Register & Handle Webhooks

Resources

​Core Settings

​1. Select Language

​2. Select Model

​OpenAI Models

​GPT-4.1

​GPT-4.1-mini

​GPT-4.1-nano

​GPT-4o

​GPT-4o Mini

​Realtime Models

​OpenAI Realtime Models

​GPT-Realtime

​GPT-Realtime-2025-08-28

​GPT-4o-Realtime-Preview

​Google Gemini Realtime Models

​Gemini-2.0-Flash-Live-001

​Gemini-Live-2.5-Flash-Preview

​Groq Models

​Groq Llama 3.3 70B Versatile

​Groq Llama 3.1 8B Instant

​Verbex Models

​Verbex Bangla Mini

​Custom Model

​3. Select STT

​4. Select Voice

​Configuration Best Practices

Language & Region

Model Selection

Voice Choice

STT Accuracy

​Performance Considerations

​Model Comparison

Core Settings

1. Select Language

2. Select Model

OpenAI Models

GPT-4.1

GPT-4.1-mini

GPT-4.1-nano

GPT-4o

GPT-4o Mini

Realtime Models

OpenAI Realtime Models

GPT-Realtime

GPT-Realtime-2025-08-28

GPT-4o-Realtime-Preview

Google Gemini Realtime Models

Gemini-2.0-Flash-Live-001

Gemini-Live-2.5-Flash-Preview

Groq Models

Groq Llama 3.3 70B Versatile

Groq Llama 3.1 8B Instant

Verbex Models

Verbex Bangla Mini

Custom Model

3. Select STT

4. Select Voice

Configuration Best Practices

Performance Considerations

Model Comparison