Short version: AgentCall now supports 13 spoken languages on every AI voice call. English, Spanish, French, German, Italian, Portuguese, Dutch, Japanese, Korean, Chinese (Mandarin), Hindi, Arabic, plus an auto-detect mode that matches whoever is on the other end of the phone. Same per-minute pricing. Works on inbound (your number answers in the caller's language) and outbound (your AI agent dials in any language you pick). This post walks through the two modes, shows a real bilingual outbound demo, and explains why pinning a language gives you cleaner results than letting the model guess on every call.
The 13 languages
Every AI voice call accepts an optional language field. Pass any of these ISO-639-1 codes (or auto):
- auto: let the AI detect the caller's language from their first reply and match it. Default behavior. Best for inbound receptionists where you don't know who is calling.
- en: English
- es: Spanish
- fr: French
- de: German
- it: Italian
- pt: Portuguese
- nl: Dutch
- ja: Japanese
- ko: Korean
- zh: Chinese (Mandarin)
- hi: Hindi
- ar: Arabic
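If you build request payloads in your own code, it can help to check the language value before dialing. The following is a minimal client-side sketch, not part of the AgentCall API: the codes are copied from the list above, while `normalize_language` and the choice to lowercase and trim input are assumptions made for illustration.

```python
# Codes copied from the list above. "auto" is the documented default.
SUPPORTED_LANGUAGES = {
    "auto": "Auto-detect",
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "nl": "Dutch",
    "ja": "Japanese",
    "ko": "Korean",
    "zh": "Chinese (Mandarin)",
    "hi": "Hindi",
    "ar": "Arabic",
}


def normalize_language(value=None):
    """Hypothetical helper: return a supported code or raise ValueError.

    A missing or blank value falls back to "auto", mirroring the documented
    behavior of omitting the field. Lowercasing and trimming are client-side
    assumptions, not documented API behavior.
    """
    if value is None or not value.strip():
        return "auto"
    code = value.strip().lower()
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {value!r}")
    return code
```

Calling `normalize_language(" ES ")` returns `"es"`, and an unknown code such as `"sv"` raises before any call is placed.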
All ten of the available voices (alloy, ash, ballad, cedar, coral, echo, marin, sage, shimmer, verse) sound natural in every supported language. You don't pick a separate voice per language. The same Marin voice that greets your English callers handles Japanese and Spanish on the same call.
Two modes: auto-detect or pinned
The language field has two operating modes. The right pick depends on whether you know what language your recipient speaks.
Mode 1: auto-detect (default)
Pass language: "auto" or omit the field entirely. The model listens to the caller's first reply, detects the language, and responds in kind. If the caller switches mid-call (e.g. starts in English and switches to Spanish), the model follows.
This is the right pick for inbound receptionists. You don't know who is calling. A Spanish-speaking customer should not have to ask “do you speak Spanish?” before getting service.
# Inbound: receptionist that auto-matches the caller's language
curl -X POST https://api.agentcall.co/v1/numbers/num_abc123/inbound-config \
  -H "Authorization: Bearer ac_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "ai",
    "systemPrompt": "You are the front desk for Acme Plumbing...",
    "voice": "marin",
    "language": "auto"
  }'
Mode 2: pinned
Pass an ISO code like language: "es". The model responds only in Spanish for the entire call, even if the caller speaks English. Under the hood, the runtime prepends a “respond exclusively in Spanish” directive to your system prompt at session-build time, and the speech-to-text layer pins its detection hint to Spanish so transcription accuracy stays high.
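To make the pinned behavior concrete, here is a rough sketch of what a session-build step like the one described above could look like. AgentCall's runtime is not published in this post, so everything here is hypothetical: `SessionConfig`, `build_session`, the field names, and the exact directive wording are illustrative assumptions only.

```python
from dataclasses import dataclass
from typing import Optional

# Names for the 12 pinnable codes listed earlier in this post.
LANGUAGE_NAMES = {
    "en": "English", "es": "Spanish", "fr": "French", "de": "German",
    "it": "Italian", "pt": "Portuguese", "nl": "Dutch", "ja": "Japanese",
    "ko": "Korean", "zh": "Chinese (Mandarin)", "hi": "Hindi", "ar": "Arabic",
}


@dataclass
class SessionConfig:
    """Hypothetical container for what the voice session receives."""
    instructions: str
    stt_language_hint: Optional[str]  # None means "let the STT layer detect"


def build_session(system_prompt: str, language: str = "auto") -> SessionConfig:
    """Illustrative only: prepend a directive and pin the STT hint when pinned."""
    if language == "auto":
        # Auto-detect: prompt passes through untouched, no STT hint.
        return SessionConfig(instructions=system_prompt, stt_language_hint=None)
    name = LANGUAGE_NAMES[language]  # KeyError for unsupported codes
    directive = f"Respond exclusively in {name}."
    return SessionConfig(
        instructions=f"{directive}\n\n{system_prompt}",
        stt_language_hint=language,
    )
```

The point of the sketch is that pinning touches two places at once: the instructions the model sees and the hint the transcriber uses.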
This is the right pick for outbound calls where you know the recipient's language. A Spanish-speaking dental practice calling Spanish-speaking patients should not roll the dice on auto-detect.
# Outbound: pin to Spanish for a Hispanic patient outreach campaign
curl -X POST https://api.agentcall.co/v1/calls/ai \
  -H "Authorization: Bearer ac_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "from": "num_abc123",
    "to": "+15125550100",
    "systemPrompt": "Eres el asistente de programación del consultorio del Dr. Paul...",
    "firstMessage": "Hola, le habla la oficina del Dr. Paul sobre su próxima limpieza dental.",
    "voice": "marin",
    "language": "es"
  }'
The firstMessage field stays verbatim. If you want a Spanish greeting, write it in Spanish yourself. The language setting governs the model's replies after that, not the scripted opener.
A real bilingual demo: Dr. Paul's dental office
Here is the saved outbound agent we ship on our demo number. It opens in English. If the patient replies in Spanish, the AI switches fully to Spanish for the rest of the call. No configuration on the recipient side. No mid-call menu prompt asking “press 1 for English, 2 for Spanish.”
{
  "from": "num_dr_paul_demo",
  "to": "+13145550100",
  "useSavedAgent": true
}
That single API call dials the patient, plays a warm English opener (“Hi, this is Dr. Paul's office calling about your next dental cleaning. Do you have a quick moment to schedule?”), and if the patient responds in Spanish (“Hola, ¿pueden hablar en español?”), the model switches:
“¡Claro que sí! Tenemos disponibilidad el martes a las diez de la mañana o el miércoles a las dos y media. ¿Cuál le funciona mejor?”
The patient picks a slot in Spanish. The AI confirms in Spanish. The call ends with a Spanish wrap-up. Total wall-clock time from API call to confirmed appointment: about 90 seconds.
The saved agent uses language: "auto" because we want either-language behavior. If you know your patient base is 100% Spanish-speaking, you would set language: "es" and write the opener in Spanish, so the whole call runs in Spanish from the first word.
CSV runners: dial 500 recipients in their own language
The interesting use case for pinned language is batch outbound. If you have a CSV of recipients with a known language column, you can dial each one with the right language pinned per row, using the existing useSavedAgent + idempotencyKey pattern from our voice prompts guide:
import csv, requests, time

API = "https://api.agentcall.co/v1/calls/ai"
HEADERS = {"Authorization": "Bearer ac_live_xxx", "Content-Type": "application/json"}
NUMBER_ID = "num_abc123"  # your number with a saved bilingual agent

# Read the CSV with an explicit encoding so accented names survive intact.
with open("patients.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        requests.post(API, headers=HEADERS, json={
            "from": NUMBER_ID,
            "to": row["phone"],
            "useSavedAgent": True,
            "language": row["language"],  # 'en' or 'es' per row
            "firstMessage": (
                f"Hi {row['name']}, this is Dr. Paul's office about your refill."
                if row["language"] == "en"
                else f"Hola {row['name']}, le habla la oficina del Dr. Paul sobre su receta."
            ),
            "idempotencyKey": f"refill-2026-05-26:{row['patient_id']}",
        })
        time.sleep(3)  # be polite to recipients
Per-call overrides win, so the saved agent provides the system prompt + voice + recording flag, the row supplies the language and the personalized greeting, and the idempotencyKey makes the loop crash-safe.
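One way to make a runner like this easier to test is to separate payload construction from dialing, so you can check greetings and keys before any call goes out. The sketch below makes no network requests. `build_call_payload`, the `GREETINGS` table, and the sample rows are hypothetical names and data invented for illustration; the payload fields are the ones used in the runner above.

```python
import csv
import io

# Greeting templates keyed by language code (wording taken from the runner above).
GREETINGS = {
    "en": "Hi {name}, this is Dr. Paul's office about your refill.",
    "es": "Hola {name}, le habla la oficina del Dr. Paul sobre su receta.",
}


def build_call_payload(row, number_id, campaign):
    """Hypothetical helper: turn one CSV row into a request body.

    The idempotency key depends only on the campaign and patient ID, so
    re-running the loop after a crash produces the same key for the same row.
    """
    language = row["language"].strip().lower()
    if language not in GREETINGS:
        raise ValueError(f"no greeting template for language {language!r}")
    return {
        "from": number_id,
        "to": row["phone"],
        "useSavedAgent": True,
        "language": language,
        "firstMessage": GREETINGS[language].format(name=row["name"]),
        "idempotencyKey": f"{campaign}:{row['patient_id']}",
    }


# Made-up sample rows standing in for patients.csv.
sample = io.StringIO(
    "patient_id,name,phone,language\n"
    "p1,Ana,+15125550101,es\n"
    "p2,Sam,+15125550102,en\n"
)
payloads = [
    build_call_payload(row, "num_abc123", "refill-2026-05-26")
    for row in csv.DictReader(sample)
]
```

Because the key is deterministic, a second pass over the same file resubmits identical keys rather than creating new ones, which is what lets the server-side idempotency check do its job.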
Why pin instead of auto-detect when you can?
Auto-detect is great for inbound. For outbound where you know the recipient's language, pinning produces cleaner calls for three reasons:
- No first-reply roulette. Auto-detect needs the recipient to speak first. On a cold outbound call, the model picks language from the patient's “hello” or “hola.” A noisy one-word reply can lead to a wrong guess, and the patient hears the next 30 seconds in the wrong language before the model self-corrects.
- Better speech-to-text accuracy. Pinning the language hint tells Whisper which language to bias toward, which matters more for accented or low-volume audio. Cleaner transcripts also mean cleaner post-call summaries and structured analysis.
- The greeting matches the language. Your firstMessage should be in the target language. Auto-detect can't change the greeting because the greeting is scripted. Pinning makes it natural to write the greeting in the right language up front.
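The rule of thumb in this section can be written as a small decision helper: pin when you know the recipient's language and it is supported, otherwise fall back to auto-detect. This is a sketch of that policy only; `choose_language` and `PINNABLE` are hypothetical names, and the fallback for unsupported codes is an assumption rather than documented API behavior.

```python
# The 12 pinnable codes from the language list earlier in this post.
PINNABLE = {"en", "es", "fr", "de", "it", "pt", "nl", "ja", "ko", "zh", "hi", "ar"}


def choose_language(known_language=None):
    """Hypothetical policy: pin a known, supported language; else use "auto"."""
    code = (known_language or "").strip().lower()
    if code in PINNABLE:
        return code
    return "auto"
```

When the helper returns a pinned code, remember to write the firstMessage in that language as well, since the scripted greeting is not translated for you.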
Pricing
Multilingual AI voice is included on every AI voice plan at no extra charge. Managed pricing stays $0.40 per minute on Pro. BYOK (bring your own OpenAI key) stays $0.10 per minute. The same per-minute rate covers calls in any of the 13 languages. There is no separate language surcharge, no per-call fee for switching languages, no extra cost for the auto-detect listener.
Free tier customers get five inbound AI voice minutes per month at no cost (no card required) and can configure those minutes in any of the 13 languages. Outbound AI voice stays Pro-only.
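For a rough budget, the per-minute rates above can be plugged into a quick estimate. This sketch assumes billing on exact fractional minutes with no per-call rounding or minimums, which this post does not specify, and it borrows the roughly 90-second figure from the demo call as a stand-in for billable duration. `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Per-minute rates quoted above (USD).
RATES_PER_MINUTE = {"managed": Decimal("0.40"), "byok": Decimal("0.10")}


def estimate_cost(calls, seconds_per_call, plan="managed"):
    """Estimate total cost, assuming exact fractional-minute billing."""
    minutes = Decimal(calls * seconds_per_call) / Decimal(60)
    return (minutes * RATES_PER_MINUTE[plan]).quantize(Decimal("0.01"))


print(estimate_cost(500, 90))          # 300.00
print(estimate_cost(500, 90, "byok"))  # 75.00
```

Under those assumptions, a 500-recipient batch at 90 seconds per call is 750 minutes, or about $300 on managed pricing and $75 with BYOK, regardless of which languages the rows use.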
What this competes with
Vapi, Retell, and Bland support multiple languages on their AI voice platforms. The differences come down to:
- Bundled phone numbers. Vapi and Retell ask you to bring your own Twilio number. AgentCall provides the phone number, the AI voice, and the cross-call memory in one product. One bill, one dashboard, one API key.
- Per-number saved agents. AgentCall lets you save an agent persona on a phone number so the dashboard and your batch runner reuse it without re-sending the prompt on every call. Vapi and Retell expose the assistant as a separate resource you have to reference per call. See our voice prompts guide for the saved-agent pattern.
- Idempotent retries. AgentCall's idempotencyKey on outbound calls makes CSV runners crash-safe. Most voice platforms make you reconcile duplicates after the fact.
- Same pricing across languages. Some platforms charge a premium for non-English voices or specific TTS providers. AgentCall's per-minute rate is flat across the 13 supported languages.
Get started
If you already have an AgentCall account, every existing inbound config and every outbound AI call already supports the language field. No migration needed. Try it now:
# Switch an inbound receptionist to Spanish-only
curl -X PATCH https://api.agentcall.co/v1/numbers/num_abc123 \
  -H "Authorization: Bearer ac_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{ "language": "es" }'
Or the MCP equivalent for Claude Desktop, Cursor, and OpenClaw users:
update_number_language(numberId="num_abc123", language="es")
If you're new, you can sign up free at agentcall.co and get one US local number plus five inbound AI minutes a month at no cost, all 13 languages supported.
For the full system prompt structure that prevents AI hallucination on phone calls (across any language), read how to write a system prompt for AI voice calls. For the saved-agent + CSV runner pattern that powers batch outbound, see the Hermes pattern section. For the broader question of why AI agents benefit from owning their own phone identity, read why AI agents need phone numbers.