The AI Caller lets admins deploy AI-powered voice bots that can conduct phone calls, in-browser audio sessions, or video sessions with candidates. Each bot follows a configurable conversation script, captures structured data during the call, and feeds results back into hiring workflows and automations.
Creating an AI Caller Bot
Each bot is configured with:
Conversation Design
| Setting | Description |
|---|---|
| System Prompt | The main conversation script that guides the bot’s behavior, questions, and tone |
| System Instructions | High-level behavioral instructions for the bot |
| Input Variables | Data passed to the bot before the call starts (e.g., candidate name, application details) — resolved from the candidate’s data |
| Output Variables | Structured data the bot should capture during the conversation (e.g., availability, salary expectations, qualifications) |
Output Variable Types
Each output variable specifies what kind of data to capture:
| Type | Description |
|---|---|
| Text | Free-form text response |
| Number | Numeric value — supports subtypes: Default, Currency, Percentage, Days, Months, Years |
| Selection | One or more choices from a predefined list (preset options or custom options, with optional multi-select) |
| Date | A date value, with optional restrictions on past or future dates |
| Email | An email address |
| Phone Number | A phone number |
Each variable also includes a capture prompt — an instruction that tells the bot exactly what to listen for.
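As a rough illustration of how these types fit together, the sketch below models an output variable as a small Python data class. The class name, field names, and validation rules are hypothetical and are not taken from the Firstwork product or API; only the type and subtype names come from the table above.

```python
from dataclasses import dataclass, field
from typing import List

# Type and subtype names as listed in the table above.
VARIABLE_TYPES = {"Text", "Number", "Selection", "Date", "Email", "Phone Number"}
NUMBER_SUBTYPES = {"Default", "Currency", "Percentage", "Days", "Months", "Years"}


@dataclass
class OutputVariable:
    """Hypothetical model of one output variable (illustrative only)."""
    name: str
    type: str
    capture_prompt: str
    number_subtype: str = "Default"                    # only meaningful for Number
    options: List[str] = field(default_factory=list)   # only meaningful for Selection
    multi_select: bool = False

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the definition looks sane."""
        problems = []
        if self.type not in VARIABLE_TYPES:
            problems.append(f"unknown type: {self.type}")
        if self.type == "Number" and self.number_subtype not in NUMBER_SUBTYPES:
            problems.append(f"unknown number subtype: {self.number_subtype}")
        # Assumption: a Selection variable without options is not useful.
        if self.type == "Selection" and not self.options:
            problems.append("Selection needs at least one option")
        # Assumption: the capture prompt should not be blank.
        if not self.capture_prompt.strip():
            problems.append("capture prompt is empty")
        return problems


salary = OutputVariable(
    name="salary_expectation",
    type="Number",
    number_subtype="Currency",
    capture_prompt="Listen for the annual salary the candidate expects.",
)
```

Calling `salary.validate()` returns an empty list here, while a Selection variable defined without options would report a problem.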
Voice & Audio Settings
| Setting | Description |
|---|---|
| Voice | Choose from 10 voices: Alloy, Ash, Ballad, Cedar, Coral, Echo, Sage, Shimmer, Verse, Marin |
| Voice Speed | Adjust the speaking rate |
| Maximum Call Duration | Set a time limit (in seconds) for the conversation |
| Short Drop Duration | Minimum call length (in seconds); a call that ends before this threshold is treated as a short/dropped call (default: 10s) |
| Profile Picture | An avatar displayed to the candidate during audio/video sessions |
Realtime Model
Choose which GPT Realtime model powers the conversation:
| Model | Description |
|---|---|
| GPT Realtime | Standard realtime model |
| GPT Realtime Mini | Lightweight model — faster responses, lower cost |
| GPT Realtime 1.5 | Latest generation model (default) — best quality and latency |
Turn Detection
Controls how the bot recognizes when the candidate has finished speaking:
| Mode | Description |
|---|---|
| Voice Activity Detection | Detects silence to determine speech boundaries — configurable threshold (0–1, default 0.65), prefix padding (ms), silence duration (ms), and idle timeout (ms) |
| Semantic Detection | Uses AI to understand conversational flow — configurable eagerness level (Low, Medium, High) |
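The two modes can be pictured as two small settings objects. The sketch below is illustrative only: the class and field names are hypothetical, and the only values taken from this page are the threshold range (0 to 1), its default of 0.65, and the three eagerness levels. Defaults for the millisecond settings are not documented here, so they are left unset.

```python
from dataclasses import dataclass
from typing import Optional

EAGERNESS_LEVELS = ("Low", "Medium", "High")


@dataclass
class VadSettings:
    """Hypothetical settings for Voice Activity Detection mode."""
    threshold: float = 0.65                      # documented default
    prefix_padding_ms: Optional[int] = None      # default not documented
    silence_duration_ms: Optional[int] = None    # default not documented
    idle_timeout_ms: Optional[int] = None        # default not documented

    def __post_init__(self) -> None:
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")


@dataclass
class SemanticSettings:
    """Hypothetical settings for Semantic Detection mode."""
    eagerness: str  # no default is documented, so it must be given explicitly

    def __post_init__(self) -> None:
        if self.eagerness not in EAGERNESS_LEVELS:
            raise ValueError(f"eagerness must be one of {EAGERNESS_LEVELS}")


vad = VadSettings(silence_duration_ms=500)   # 500 ms is an arbitrary example value
semantic = SemanticSettings(eagerness="High")
```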
Smooth Barge-in
When enabled, the bot uses local Silero-based VAD to detect when a candidate speaks over the bot, making mid-sentence barge-in feel more natural. Disabled by default.
Summarization
A summary of the conversation is automatically generated after every call. A custom summary prompt can be provided to control what aspects of the conversation are highlighted.
Conversation History
When enabled, the bot includes summaries from prior calls with the same candidate in its context — useful for multi-stage workflows or follow-up calls.
Call Recording
Call recording requires a feature flag to be enabled for your account. Once enabled, recordings are stored securely and linked to the enrollment record. Even without the feature flag, recording can be enabled for individual test calls to review bot behaviour before going live.
Data Access
The bot can optionally be granted access to the candidate’s full data profile, allowing it to reference application details, documents, and other information during the conversation.
Bots can invoke automations during a live call. Each linked automation (a Caller Bot Automation) is exposed to the AI as a callable tool.
| Setting | Description |
|---|---|
| Automation | The automation to invoke |
| Slug | Unique identifier for this tool (must be unique per bot) |
| Tool Description | Plain-language description of when the AI should invoke this tool |
| Payload Schema | Fields the AI must collect before invoking (each field has a name, type, and description) |
| Enabled | Toggle the tool on or off without removing the configuration |
The AI decides when to invoke a tool based on the conversation context and tool descriptions. Automations run asynchronously — the call continues while the automation executes.
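To make the tool settings concrete, the sketch below represents two linked automations as plain dictionaries and checks the two constraints mentioned above: slugs must be unique per bot, and the payload fields must be collected before invocation. The dictionary keys, slugs, and helper names are hypothetical and do not reflect the actual Firstwork data model.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical tool definitions for one bot (illustrative values only).
tools = [
    {
        "slug": "schedule_interview",
        "tool_description": "Use when the candidate agrees to an interview slot.",
        "payload_schema": [
            {"name": "slot", "type": "Date", "description": "Agreed interview date"},
            {"name": "notes", "type": "Text", "description": "Anything to pass on"},
        ],
        "enabled": True,
    },
    {
        "slug": "send_documents_link",
        "tool_description": "Use when the candidate asks how to upload documents.",
        "payload_schema": [],
        "enabled": False,
    },
]


def duplicate_slugs(bot_tools: List[Dict]) -> List[str]:
    """Return slugs that appear more than once on the same bot."""
    counts = Counter(tool["slug"] for tool in bot_tools)
    return sorted(slug for slug, n in counts.items() if n > 1)


def missing_fields(tool: Dict, payload: Dict) -> List[str]:
    """Return payload schema fields that have not been collected yet."""
    return [f["name"] for f in tool["payload_schema"] if f["name"] not in payload]


def callable_tools(bot_tools: List[Dict]) -> List[str]:
    """Slugs of tools that are switched on."""
    return [tool["slug"] for tool in bot_tools if tool["enabled"]]
```

With these definitions, `missing_fields(tools[0], {"slot": "2025-01-15"})` reports that `notes` is still outstanding, and `callable_tools(tools)` leaves out the disabled tool.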
Call Types
| Type | Description |
|---|---|
| Phone | An outbound phone call to the candidate’s number, placed via a telephony provider |
| Audio | An in-browser audio session — the candidate speaks through their device microphone |
| Video | An in-browser video session — includes camera and microphone |
How Calls Are Triggered
| Method | Description |
|---|---|
| From a Hiring Flow Form | An AI Caller element is embedded in a form page — the call is initiated when the candidate reaches that step |
| From an Automation | The “AI Caller” automation action enrolls a candidate in a call |
| Manual Test | Admins can test the bot directly from the setup interface |
When an AI Caller is embedded within a hiring flow form, admins configure form mappings:
- Input mappings connect data from the candidate’s profile to the bot’s input variables
- Output mappings connect the bot’s captured responses to specific form fields
This means the bot’s conversation can both read from and write to the candidate’s application data.
If the form element is configured as blocking, the candidate cannot proceed to the next page until the call is completed.
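The input and output mappings described above can be pictured as two lookups, one applied before the call and one after it. The sketch below is a simplified illustration; the function names, dictionary shapes, and sample field names are all hypothetical.

```python
from typing import Any, Dict


def resolve_inputs(input_mappings: Dict[str, str], profile: Dict[str, Any]) -> Dict[str, Any]:
    """Build the bot's input variables from the candidate profile.

    input_mappings maps a bot input variable to a profile key.
    Missing profile keys resolve to None in this sketch.
    """
    return {variable: profile.get(key) for variable, key in input_mappings.items()}


def apply_outputs(
    output_mappings: Dict[str, str],
    captured: Dict[str, Any],
    form: Dict[str, Any],
) -> Dict[str, Any]:
    """Copy captured output values into their mapped form fields.

    output_mappings maps a bot output variable to a form field.
    Returns a new dict and leaves the original form untouched.
    """
    updated = dict(form)
    for variable, field_name in output_mappings.items():
        if variable in captured:
            updated[field_name] = captured[variable]
    return updated


profile = {"full_name": "Dana Lee", "applied_role": "Warehouse Associate"}
inputs = resolve_inputs({"candidate_name": "full_name", "role": "applied_role"}, profile)

captured = {"availability": "Weekends", "salary": 42000}
form = apply_outputs({"availability": "availability_field"}, captured, {"availability_field": None})
```

Only mapped variables reach the form, so the captured `salary` value is ignored here because no output mapping points to a form field for it.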
Call Logs & Monitoring
Every call generates a detailed enrollment record. Admins can review calls across three views:
- Test Logs — Calls initiated from the admin test interface
- Automation Logs — Calls triggered by automations
- Application Logs — Calls triggered during form submission
Log Filters
| Filter | Options |
|---|---|
| Search | By candidate name |
| Call Type | Phone, Audio, Video |
| Call Status | Initiated, In Progress, Completed, Failed |
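The three filters combine as a simple conjunction: a call is shown only if it matches every filter that is set. The sketch below illustrates that logic over in-memory records; the record keys and the case-insensitive name match are assumptions, not documented behavior.

```python
from typing import Dict, Iterable, List, Optional


def filter_calls(
    records: Iterable[Dict[str, str]],
    search: Optional[str] = None,
    call_type: Optional[str] = None,
    status: Optional[str] = None,
) -> List[Dict[str, str]]:
    """Return records matching all supplied filters; unset filters match everything."""
    results = []
    for record in records:
        # Assumption: search is a case-insensitive substring match on the name.
        if search and search.lower() not in record["candidate_name"].lower():
            continue
        if call_type and record["call_type"] != call_type:
            continue
        if status and record["status"] != status:
            continue
        results.append(record)
    return results


calls = [
    {"candidate_name": "Dana Lee", "call_type": "Phone", "status": "Completed"},
    {"candidate_name": "Sam Ortiz", "call_type": "Video", "status": "Failed"},
    {"candidate_name": "Dana Kim", "call_type": "Phone", "status": "In Progress"},
]

completed_phone = filter_calls(calls, search="dana", call_type="Phone", status="Completed")
```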
Call Detail
Each call record includes:
| Section | Content |
|---|---|
| Summary | Status, duration, and count of captured answers |
| Recording | Audio playback of the conversation (requires feature flag, or enabled per test call) |
| Transcript | Full text of the conversation |
| Conversation Summary | AI-generated summary of the conversation |
| Candidate Details | Linked candidate profile information |
| Output Values | All captured variable values |
Admins can re-trigger a test call or re-compute captured answers from the transcript.
Call Statuses
| Status | Description |
|---|---|
| Initiated | Call is queued but not yet started |
| In Progress | Call is currently active |
| Completed | Call finished successfully |
| Failed | Call could not be completed due to an error |
| Short Drop | Call ended before the minimum call duration threshold |
| Identity Mismatch | The person who answered could not be verified as the intended candidate |
| Reschedule Required | The candidate requested to be called back at a different time |
| Connection Error | Call could not be established due to a telephony or network issue |
| User Unreachable | Candidate did not answer — phone was busy, went to voicemail, or timed out |
| Timed Out | Call exceeded the maximum allowed duration |
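For code that consumes call records, the statuses above map naturally onto an enumeration. The sketch below is a hypothetical model: the enum, the grouping into active and finished statuses, and the strict less-than comparison for short drops are assumptions drawn from the descriptions in the table, not from a published API.

```python
from enum import Enum


class CallStatus(Enum):
    """Status labels as they appear in the table above."""
    INITIATED = "Initiated"
    IN_PROGRESS = "In Progress"
    COMPLETED = "Completed"
    FAILED = "Failed"
    SHORT_DROP = "Short Drop"
    IDENTITY_MISMATCH = "Identity Mismatch"
    RESCHEDULE_REQUIRED = "Reschedule Required"
    CONNECTION_ERROR = "Connection Error"
    USER_UNREACHABLE = "User Unreachable"
    TIMED_OUT = "Timed Out"


# Assumption: queued and active calls are the only statuses that can still change.
ACTIVE_STATUSES = {CallStatus.INITIATED, CallStatus.IN_PROGRESS}


def is_finished(status: CallStatus) -> bool:
    """True for any status that describes how a call ended."""
    return status not in ACTIVE_STATUSES


def is_short_drop(duration_seconds: float, short_drop_duration: float = 10.0) -> bool:
    """True when a call ended before the Short Drop Duration (default 10s).

    Assumption: a call lasting exactly the threshold is not a short drop.
    """
    return duration_seconds < short_drop_duration
```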